
digitalmars.D - write, toString, formatValue & range interface

spir <denis.spir gmail.com> writes:
Hello,


Had a nice time debugging an issue after having added an input range interface to a (big) struct type. Finally managed to deduce that the problem happens when writing out an element of the struct type. This introduced an infinite loop ending in a segfault. Found it weird because the struct's toString does not iterate over the type, so there was no reason to use the range interface.
This is why I guessed toString was not called. And in fact, forcing its use by explicitly calling .toString() solved the bug! 2 correspondents (Stephan Mueller & Ivan Melnychuk) helped me by pointing to the various template-selection criteria of formatValue.



There seems to be a pair of bugs in the constraints of the set of formatValue templates, which cause the following problems:
* If a class defines both toString and a range interface, compiler error (additional bug pointed out by Stephan Mueller).
* For structs, the presence of a range interface shortcuts toString.
* If a range outputs elements of its own type, writing (and probably other features) runs into an infinite loop. This case is not checked yet.
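
A minimal sketch of the kind of type involved, assuming a hypothetical struct Countdown (not the actual struct from this report) that has both an input range interface and its own toString. It only shows the workaround described above, calling toString() explicitly; what writeln(c) itself prints depends on the Phobos version, so that is left out.

```d
import std.conv : to;
import std.range : isInputRange;
import std.stdio : writeln;

// Hypothetical example type: an input range that also defines toString.
struct Countdown
{
    int n;

    // input range interface
    @property bool empty() const { return n <= 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    // programmer-defined text form; does not iterate over the range
    string toString() const { return "Countdown(" ~ to!string(n) ~ ")"; }
}

static assert(isInputRange!Countdown);

void main()
{
    auto c = Countdown(3);
    // Calling toString() explicitly sidesteps whichever formatValue
    // overload would otherwise be selected for the struct.
    writeln(c.toString()); // prints "Countdown(3)"
}
```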



The following changes may, I guess, solve the first two problems:
(1) structs added (with classes) to the template selecting the use of toString
(2) the template that selects the use of ranges checks there is no toString
(3) the special case of using T.stringof for structs must be selected only in last resort -- actually, this case may be suppressed and integrated into the general class/struct case.

This means changing the following formatValue templates (quickly written, absolutely untested ;-):

// case use toString (or struct .stringof): add structs
void formatValue(Writer, T, Char)(Writer w, T val, ref FormatSpec!Char f)
if (is(T == class) || is(T == struct))
{
    // in case of struct, detect whether toString is defined, else use T.stringof
}

// case use range interface: check no toString available
// also add a test that the range does not output elements of the same type!!!
void formatValue(Writer, T, Char)(Writer w, T val,
        ref FormatSpec!Char f)
if (
    isInputRange!T && !isSomeChar!(ElementType!T) &&
    !is(typeof(val.toString()) == string)
)
{...}

// special case use T.stringof for struct: useless? (else check no toString)
void formatValue(Writer, T, Char)(Writer w, T val,
        ref FormatSpec!Char f)
if (
    is(T == struct) && !isInputRange!T &&
    !is(typeof(val.toString()) == string)
)
{
    put(w, T.stringof);
}
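
The two checks these constraints rely on can be tried in isolation. The sketch below is only an illustration under stated assumptions: hasStringToString and yieldsItself are hypothetical helper names, not Phobos symbols, and the sample types are made up. It checks at compile time whether toString() yields a string and whether a range's element type is the range type itself.

```d
import std.range : ElementType, isInputRange;

// Hypothetical helper: true when calling toString() on a T yields a string.
enum bool hasStringToString(T) = is(typeof(T.init.toString()) == string);

// Hypothetical helper: true when a range yields elements of its own type.
enum bool yieldsItself(T) = isInputRange!T && is(ElementType!T == T);

struct Plain { int x; }

struct Named
{
    string toString() const { return "Named"; }
}

// A text-like range whose front is again a Text (a one-character string).
struct Text
{
    string s;
    @property bool empty() const { return s.length == 0; }
    @property Text front() const { return Text(s[0 .. 1]); }
    void popFront() { s = s[1 .. $]; }
}

class Node {}

static assert(!hasStringToString!Plain);
static assert( hasStringToString!Named);
static assert( hasStringToString!Node); // toString is inherited from Object
static assert( yieldsItself!Text);
static assert(!yieldsItself!(int[]));

void main() {}
```

Note that the check succeeds for any class, since toString comes from Object, so it cannot tell an inherited toString from an explicitly defined one.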



Also, (1) the online doc of std.format seems outdated (it shows no constraints, for instance), and (2) the in-source doc is rather confusing: several comments do not describe the code that follows them.



Hope this helps,
Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
Dec 14 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Tue, 14 Dec 2010 05:02:41 -0500, spir <denis.spir gmail.com> wrote:
 [...]

Having recently run into this without knowing it, vote++. Also, please file a bug report (or two).
Dec 14 2010
spir <denis.spir gmail.com> writes:
On Tue, 14 Dec 2010 09:35:20 -0500
"Robert Jacques" <sandford jhu.edu> wrote:

 Having recently run into this without knowing it,

Which one? (This issue causes 3 distinct bugs.)

 vote++. Also, please file a bug report (or two).

Done: http://d.puremagic.com/issues/show_bug.cgi?id=5354 -- see also below

Denis

====================== text of issue report ==========================

formatValue: range templates introduce 3 bugs related to class & struct cases

This issue concerns the class case, the struct case, and the 3 range cases of the set of formatValue templates in std.format. As this set is currently written and commented (1), it seems to be intended to determine the following cases (about class/struct/range only):
* An input range is formatted like an array.
* A class object is formatted using toString.
* A struct is formatted:
    ~ using an input range interface, if it implements one,
    ~ using toString, if it defines it,
    ~ in last resort, using the type's 'stringof' property.

To be short: I think the right thing to do is to remove the range cases. Explanations, details, & reasoning below.

In the way the set of templates is presently implemented, and because of how template selection works (as opposed to inheritance, e.g.), the following 3 bugs come up:

1. When a class defines an input range, compiler error due to the fact that both the class and input range cases match:
    /usr/include/d/dmd/phobos/std/format.d(1404): Error: template
    std.format.formatValue(Writer,T,Char) if (is(const(T) == const(void[])))
    formatValue(Writer,T,Char) if (is(const(T) == const(void[])))
    matches more than one template declaration,
    /usr/include/d/dmd/phobos/std/format.d(1187):formatValue(Writer,T,Char)
    if (isInputRange!(T) && !isSomeString!(T) && isSomeChar!(ElementType!(T)))
    and /usr/include/d/dmd/phobos/std/format.d(1260):formatValue(Writer,T,Char)
    if (is(T == class))
This is due to inheritance from Object, even if no toString is _explicitly_ defined.

2. For a struct, a programmer-defined output format in toString is shortcut if ever the struct implements a range interface!

3. If a range's element type (result type of front) is identical to the range's own type, writing runs into an infinite loop... This is well possible, for instance a textual type working like strings in high-level/dynamic languages (a character is a singleton string).

To solve these bugs, I guess the following changes would have to be done:
* The 3 range cases must have 2 additional _negative_ constraints:
    ~ no toString defined on the type
    ~ (ElementType!T != T)
* The struct case must be split in 2 sub-cases:
    ~ use toString if defined
    ~ [else use range if defined, as given above]
    ~ if neither toString nor range, use T.stringof

I have tried to implement and test this modification, but ran into build errors (seemingly unrelated, about isTuple) I could not solve.

Now, I think it is worth wondering whether all these complications, only to have _default_ formatValue's for input ranges, are worth it at all. On one hand, in view of the analogy, it looks like a nice idea to have them expressed like arrays. On the other, when can this feature be useful?

A first issue comes up because there is no way, AFAIK, to tell apart inherited and explicit toString methods of classes: is(typeof(val.toString()) == string) is always true for a class. So the range case would never be triggered for classes -- only for structs.

So, to use this feature, (1) the type must be a struct (2) which defines no toString (3) which implements a range interface, and (4) the range's element type must not be the range type itself. In addition, the most sensible output form for it should be precisely the one of an array.

Note that unlike for structs, programmers cannot define custom forms of array output ;-) This is the reason why a default array format is so helpful -- but this reason does not exist for structs, thanks to toString (and later writeTo).

If no default form exists for ranges, then in the rare cases where a programmer would implement a range interface on a struct _and_ need to re-create an array-like format for it, this takes a few lines in toString, for instance:

    string toString () {
        string[] contents = new string[this.elements.length];
        foreach (i,e ; this.elements)
            contents[i] = to!string(this.elements[i]);
        return format("[%s]", join(contents, ", "));
    }

As a conclusion, I would recommend getting rid of the (3) range cases in the set of formatValue templates. (This would directly restore correctness, I guess -- showing that the range cases were probably added later.)

I marked the bug(s) with keyword 'spec', as it depends on: how do we want struct/class/range formatting semantics to be?

(1) There is at least a doc/comment error, namely for the struct case (commented as AA instead). Also, the online doc does not hold template constraints, so that it is not possible to determine which one is selected in given situations.

-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
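
For reference, the array-like toString from the issue report can be made self-contained as sketched below. Bag and its elements field are hypothetical stand-ins for the real struct, and the imports reflect present-day Phobos module names, which is an assumption relative to the 2010 library.

```d
import std.array : join;
import std.conv : to;
import std.format : format;
import std.stdio : writeln;

// Hypothetical container-like struct standing in for the real type.
struct Bag
{
    int[] elements;

    // Re-creates the default array look by hand.
    string toString() const
    {
        auto contents = new string[elements.length];
        foreach (i, e; elements)
            contents[i] = to!string(e);
        return format("[%s]", join(contents, ", "));
    }
}

void main()
{
    writeln(Bag([1, 2, 3]).toString()); // prints "[1, 2, 3]"
}
```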
Dec 15 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Wed, 15 Dec 2010 04:48:43 -0500, spir <denis.spir gmail.com> wrote:

 On Tue, 14 Dec 2010 09:35:20 -0500
 "Robert Jacques" <sandford jhu.edu> wrote:

 Having recently run into this without knowing it,

 Which one? (This issue causes 3 distinct bugs.)

A struct + opDispatch combination resulted in a compile time error for me. Of course, since the opDispatch routine returned itself, I figure an infinite loop would have occurred if it did compile. Right now, I've added template constraints, but it is a suboptimal solution to the problem.
Dec 15 2010
"Nick Voronin" <elfy.nv gmail.com> writes:
I guess the discussion on bugzilla strayed quite far from the original point,
so I'll answer here.

 I don't understand your point about "Now if we bundle data and Range interface
 together all kind of funny things happen." A type that implements a range always
 holds data, usually provides many other features than just range/iteration,

Look into std/container.d. Containers hold data, Ranges traverse data (and are separate struct types). Think: how would you generally hold data in something which exhausts itself on iteration? There is save(), but then again it may not be there :) And it creates a _new_ range object. See, all kinds of complications come from mixing things, when they are intended to simplify programming.
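
A small sketch of that separation, assuming std.container's Array as the example container (the variable names are illustrative only): the container keeps the data, while the range obtained by slicing it is a distinct object that gets used up by iteration.

```d
import std.container : Array;

void main()
{
    auto store = Array!int(1, 2, 3); // the container owns the data
    auto r = store[];                // a separate range type walks it

    while (!r.empty)
        r.popFront();                // iteration exhausts the range only

    assert(r.empty);
    assert(store.length == 3);       // the container is untouched
}
```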
Dec 15 2010