digitalmars.D - Re: Poll on improved format strings.

renoX <renosky free.fr> writes:
Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

If compatibility with printf (allowing %d without a following {...}) is removed, and I think I'll do this, then %{var} would also be allowed.
 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.
 
 So how about something like this:
 
 Expansion ::= "$" ExpansionSpec
 
 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr
 
 FormattingOptions ::= "(" string ")"
 
 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier
 
 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".
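
The grammar above is small enough to sketch as a tokenizer. The following is only an illustration, written in Python rather than D so it can be run as-is; the names EXPANSION and parse are made up for this sketch, and it assumes that the formatting options contain no ")" and that the braced expression contains no "}".

```python
import re

# One expansion: "$", optional "(options)", then "{expression}" or an identifier.
EXPANSION = re.compile(
    r"\$"
    r"(?:\((?P<opts>[^)]*)\))?"                # FormattingOptions ::= "(" string ")"
    r"(?:\{(?P<expr>[^}]*)\}"                  # ExpansionExpr ::= "{" D_Expression "}"
    r"|(?P<ident>[A-Za-z_][A-Za-z0-9_]*))"     # ExpansionExpr ::= D_Identifier
)

def parse(fmt):
    """Split fmt into ("text", s) and ("expand", options, target) pieces."""
    pieces = []
    pos = 0
    for m in EXPANSION.finditer(fmt):
        if m.start() > pos:
            pieces.append(("text", fmt[pos:m.start()]))
        # The target is the braced expression if present, else the bare identifier.
        target = m.group("expr") if m.group("expr") is not None else m.group("ident")
        pieces.append(("expand", m.group("opts"), target))
        pos = m.end()
    if pos < len(fmt):
        pieces.append(("text", fmt[pos:]))
    return pieces
```

With this sketch, parse("$(08){var1+var2}") yields one expansion with options "08" and target "var1+var2", while parse("$somevar") yields an expansion with no options.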

Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, with the one added requirement that a {...} follows. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but they should be kept to avoid surprising C programmers.) As for $somevar or %somevar without {...}, I don't know: it's shorter, sure, but less readable, I think.
 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.
 
 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

On const char[] only, I agree. The problem with the 'meaningful error' is that D doesn't provide a way to print the line number where the template was called.
 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:
 
 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


Python offers this kind of string formatting with associative arrays, I think. What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?
 	-- Daniel
 
 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

I don't understand the difference between %q and %s. Regards, renoX
Mar 07 2007
Don Clugston <dac nospam.com.au> writes:
renoX wrote:

Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, with the one added requirement that a {...} follows. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but they should be kept to avoid surprising C programmers.) As for $somevar or %somevar without {...}, I don't know: it's shorter, sure, but less readable, I think.

I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating-point number called abc?
Mar 07 2007
renoX <renosky free.fr> writes:
Don Clugston Wrote:
 renoX wrote:
 As for $somevar or %somevar without {...}, I don't know: it's shorter, sure, but less readable, I think.

I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating-point number called abc?

Good remark. One problem with the syntax of appending {var} after a printf format string is that it doesn't work for the equivalent of writef("%*d", width, var). Maybe %*{width}{var}?
renoX
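
To make the %...{var} idea concrete, here is a rough sketch of how such specifiers could be recognised, including the %*{width}{var} form suggested above. It is written in Python so that it runs as-is; SPEC and expand are hypothetical names, variables are looked up in a plain dictionary instead of the enclosing scope, and a missing conversion letter is assumed to mean "format as a string". Note that this sketch takes the reading in which %f{abc} is a floating-point conversion applied to abc.

```python
import re

# "%", flags, optional width (digits or "*{name}"), optional precision,
# optional conversion letter, then the mandatory "{name}".
SPEC = re.compile(
    r"%(?P<flags>[-+ #0]*)"
    r"(?:(?P<width>\d+)|\*\{(?P<wname>\w+)\})?"
    r"(?:\.(?P<prec>\d+))?"
    r"(?P<conv>[diouxXeEfgGsc])?"
    r"\{(?P<name>\w+)\}"
)

def expand(fmt, variables):
    """Replace each %...{name} in fmt using values from the variables dict."""
    def render(m):
        conv = m.group("conv") or "s"      # assumption: no letter means "%s"
        width = m.group("width") or ""
        if m.group("wname"):
            # "*{name}": the width itself comes from a variable.
            width = str(variables[m.group("wname")])
        prec = "." + m.group("prec") if m.group("prec") else ""
        spec = "%" + m.group("flags") + width + prec + conv
        return spec % (variables[m.group("name")],)
    return SPEC.sub(render, fmt)
```

For instance, expand("[%*{w}{x}]", {"w": 5, "x": 42}) pads 42 to a width of 5, and expand("%08X{x}", {"x": 255}) produces a zero-padded hexadecimal value.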
Mar 07 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
Don Clugston wrote:

I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating-point number called abc?

All that trouble to post a BNF grammar, wasted! :P 'f' is a valid identifier, so it's printing a variable called 'f', followed by the text '{abc}'. Incidentally, I used '$' so that people wouldn't confuse it with a C printf format string (so the third possible case doesn't really count in my version :P).

	-- Daniel

--
Unlike Knuth, I have neither proven or tried the above; it may not even make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/
Mar 07 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
renoX wrote:
 Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

If compatibility with printf (allowing %d without a following {...}) is removed, and I think I'll do this, then %{var} would also be allowed.
 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.

 So how about something like this:

 Expansion ::= "$" ExpansionSpec

 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr

 FormattingOptions ::= "(" string ")"

 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier

 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".

Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, with the one added requirement that a {...} follows. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but they should be kept to avoid surprising C programmers.)

(What, and having type inference, templates, aa's, compile-time function evaluation, et al. isn't going to surprise them?! :P This is incidentally WHY I picked "$": so that they don't think "Hey; this is a printf format string!")
 As for $somevar or %somevar without {...}, I don't know: it's shorter, sure, but less readable, I think.

I'm sure bash and PHP programmers would disagree with you. :P I chose the syntax precisely because there are languages out there that do the *exact* same thing: the only thing that's new is the (opt) part. You want to cater to C programmers, and that's fine; I'm more interested in coming from a different angle (mostly because I like how these look).
 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.

 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

On const char[] only, I agree. The problem with the 'meaningful error' is that D doesn't provide a way to print the line number where the template was called.

A trick might be to use a... hmm... I wonder, if your function returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...) keyword, which line would the compiler say it was on? :P
 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:

 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


Python offers this kind of string formatting with associative arrays, I think. What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?

For the first, it simply uses the first one it finds. Actually, this design is bad because of the multiple aa thing, but I only did that because Python is dynamically typed. And there are cases where being able to dump multiple types into a format string is very handy. For the second, it's clearly a run-time error; throw an exception. One possible alternative is to have a slightly different version which will only fill in the fields it can, and leave the others.
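
The lookup rules just described (first table that has the key wins; a missing key is a run-time error; optionally leave unknown fields alone) can be sketched as follows. This is an illustration in Python, not the proposed D function: FIELD, lookup and expand are names invented here, and formatting options such as $(q) are left out to keep it short. Python's own string.Template offers a comparable strict/lenient pair with substitute and safe_substitute.

```python
import re

# "$" followed by "{key}" or a bare identifier (formatting options omitted).
FIELD = re.compile(r"\$(?:\{(?P<braced>[^}]*)\}|(?P<ident>[A-Za-z_][A-Za-z0-9_]*))")

def lookup(key, tables):
    """Return the value from the first table that contains key."""
    for table in tables:
        if key in table:
            return table[key]
    raise KeyError(key)

def expand(fmt, *tables, strict=True):
    """Fill in $fields from the given tables.

    strict=True raises KeyError for an unknown field;
    strict=False leaves the unknown field in the output untouched.
    """
    def render(m):
        key = m.group("braced") if m.group("braced") is not None else m.group("ident")
        try:
            return str(lookup(key, tables))
        except KeyError:
            if strict:
                raise
            return m.group(0)
    return FIELD.sub(render, fmt)
```

Passing the tables in priority order makes the "first one it finds" rule explicit at the call site, which is one way to live with several typed tables.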
 	-- Daniel

 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

I don't understand the difference between %q and %s. Regards, renoX

auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
 writefln("$(s)msg", msg);
 writefln("$(q)msg", msg);

Produces:
 This is a "string".
         It has some escaped characters.
 "This is a \"string\".\n\tIt has some escaped characters."
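
The same contrast can be reproduced in a few lines of Python. This is only a sketch: format_s and format_q are hypothetical helpers, and json.dumps is used as a stand-in for a quoting formatter; Lua's own %q output differs in some details.

```python
import json

def format_s(value):
    """Plain output, like %s: the string's characters as they are."""
    return str(value)

def format_q(value):
    """Quoted output in the spirit of %q: surrounding quotes plus escapes."""
    return json.dumps(value)

msg = "This is a \"string\".\n\tIt has some escaped characters."

print(format_s(msg))   # two lines, the second one starting with a tab
print(format_q(msg))   # one line, wrapped in quotes, with \" \n \t spelled out

# The quoted form can be read back into the original string.
assert json.loads(format_q(msg)) == msg
```

The round trip in the last line is why a quoted form is most useful where the output is fed back to a parser or evaluator.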

This is, admittedly, more useful in Lua, which has an eval function :P

	-- Daniel
Mar 07 2007
renoX <renosky free.fr> writes:
Daniel Keep wrote:
 
 renoX wrote:
 Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.

 So how about something like this:

 Expansion ::= "$" ExpansionSpec

 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr

 FormattingOptions ::= "(" string ")"

 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier

 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".

%{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but they should be kept to avoid surprising C programmers.)

(What, and having type inference, templates, aa's, compile-time function evaluation, et al. isn't going to surprise them?! :P This is incidentally WHY I picked "$": so that they don't think "Hey; this is a printf format string!")
 As for $somevar or %somevar without {...}, I don't know: it's shorter, sure, but less readable, I think.

I'm sure bash and PHP programmers would disagree with you. :P I chose the syntax precisely because there are languages out there that do the *exact* same thing: the only thing that's new is the (opt) part.

Note that in Ruby, also a scripting language, the syntax is "...#{<var>}...". Of course, there were some requests to allow #var, but so far Matz has rejected them (I think; I don't follow Ruby closely).
 You want to cater to C programmers, and that's fine; I'm more interested
 in coming from a different angle (mostly because I like how these look).

Well, D currently uses C-style format strings, so I'm not sure what would be gained by changing the syntax.
 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.

 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

The problem with the 'meaningful error' is that D doesn't provide a way to print the line number where the template was called.

A trick might be to use a... hmm... I wonder, if your function returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...) keyword, which line would the compiler say it was on? :P

Interesting suggestion. I'll try it.
 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:

 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?

For the first, it simply uses the first one it finds. Actually, this design is bad because of the multiple aa thing, but I only did that because Python is dynamically typed. And there are cases where being able to dump multiple types into a format string is very handy. For the second, it's clearly a run-time error; throw an exception. One possible alternative is to have a slightly different version which will only fill in the fields it can, and leave the others.
 	-- Daniel

 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

Regards, renoX

auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
 writefln("$(s)msg", msg);
 writefln("$(q)msg", msg);

Produces:
 This is a "string".
         It has some escaped characters.
 "This is a \"string\".\n\tIt has some escaped characters."

This is, admittedly, more useful in Lua which has an eval function :P -- Daniel

OK, thanks for the information. renoX
Mar 07 2007