digitalmars.D - Re: Poll on improved format strings.
- renoX <renosky free.fr> Mar 07 2007
- Don Clugston <dac nospam.com.au> Mar 07 2007
- renoX <renosky free.fr> Mar 07 2007
- Daniel Keep <daniel.keep.lists gmail.com> Mar 07 2007
- Daniel Keep <daniel.keep.lists gmail.com> Mar 07 2007
- renoX <renosky free.fr> Mar 07 2007
Daniel Keep wrote:
> To be honest, I think the type suffix needs to go. After all, if you know what the type is at compile-time, why do I need to repeat it?

If the compatibility with printf (allowing %d without {...} in the format string) is removed (and I think I'll do this), then %{var} would also be allowed.

> Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output. So how about something like this:
>
>     Expansion ::= "$" ExpansionSpec
>     ExpansionSpec ::= FormattingOptions ExpansionExpr
>     ExpansionSpec ::= ExpansionExpr
>     FormattingOptions ::= "(" string ")"
>     ExpansionExpr ::= "{" D_Expression "}"
>     ExpansionExpr ::= D_Identifier
>
> So the example above becomes "... $(08){var1+var2} ...": one character longer, but gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".

Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, just with the obligation to follow them with {...}. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but to avoid surprising C programmers they should be kept). $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.

> Plus, if you discount the formatting stuff out the front, it's roughly comparable to how variable expansions are written in bash and the like. I also think that Nemerle (which has had this sort of compile-time printf stuff for ages) does it the same way.
>
> As for the spec itself: it should be const char[] only, and display a meaningful error if the programmer tries to pass a non-const char[].

For the const char[] only, I agree. The problem with the 'meaningful error' is that D doesn't provide a way to print the line number of where the template was called.

> That said, I think you should also provide a "run-time" version of the function that has the exact same parser, formatting, etc., but the user can pass one or more hash maps to the function. This would allow people to use the same format for both compile and runtime, whilst not making the runtime version a security risk (well, aside from arbitrary expressions, anyway). For example:
>
>     auto author = "renoX";
>     auto d_bdfl = "Walter Bright";
>     auto life = 42;
>     mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
>                     "Meaning of life: $life"));
>     // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42
>
>     char[][char[]] strings;
>     int[char[]] ints;
>     strings["author"] = author;
>     strings["d-bdfl"] = d_bdfl;
>     ints["life"] = life;
>     auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
>                      "Meaning of life: $life";
>     writefln(formatstr, strings, ints);
>     // Prints the same thing as above

Python has this string manipulation possibility with the associative array, I think. What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?

> -- Daniel
>
> P.S. $(q){...} is stolen from Lua's "%q" format specifier: prints a string out complete with escaping and quotation marks :P

I don't understand the difference between %q and %s.

Regards,
renoX
Mar 07 2007
renoX wrote:
> Daniel Keep wrote:
>> To be honest, I think the type suffix needs to go. After all, if you know what the type is at compile-time, why do I need to repeat it?
>
> If the compatibility with printf (allowing %d without {...} in the format string) is removed (and I think I'll do this), then %{var} would also be allowed.
>
>> Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output. So how about something like this:
>>
>>     Expansion ::= "$" ExpansionSpec
>>     ExpansionSpec ::= FormattingOptions ExpansionExpr
>>     ExpansionSpec ::= ExpansionExpr
>>     FormattingOptions ::= "(" string ")"
>>     ExpansionExpr ::= "{" D_Expression "}"
>>     ExpansionExpr ::= D_Identifier
>>
>> So the example above becomes "... $(08){var1+var2} ...": one character longer, but gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".
>
> Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, just with the obligation to follow them with {...}. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but to avoid surprising C programmers they should be kept). $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.

I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating point number called abc?
Mar 07 2007
Don Clugston wrote:
> renoX wrote:
>> $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.
>
> I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating point number called abc?

Good remark. One thing that is a problem with the syntax of appending {var} after a printf format string is that it doesn't work for the equivalent of writef("%*d", width, var). Maybe %*{width}{var}?

renoX
Mar 07 2007
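For readers who have not met the "%*d" form renoX mentions above: the "*" takes the field width from the argument list rather than from the format string, which is why a single trailing {var} is not enough to express it. A minimal sketch, assuming present-day D and Phobos's std.format (the thread predates both); the helper name padded is hypothetical.

```d
import std.format : format;

/// Hypothetical helper: right-align `value` in a field of `width` columns.
string padded(int width, int value)
{
    // '*' consumes one argument (the width) before the value itself.
    return format("%*d", width, value);
}

void main()
{
    assert(padded(6, 42) == "    42");
    // Same result as writing the width into the format string.
    assert(padded(6, 42) == format("%6d", 42));
}
```

Any embedded syntax therefore has to name two expressions for this case, which is what the %*{width}{var} suggestion does.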
Don Clugston wrote:
> renoX wrote:
>> Daniel Keep wrote:
>>> To be honest, I think the type suffix needs to go. After all, if you know what the type is at compile-time, why do I need to repeat it?
>>
>> If the compatibility with printf (allowing %d without {...} in the format string) is removed (and I think I'll do this), then %{var} would also be allowed.
>>
>>> Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output. So how about something like this:
>>>
>>>     Expansion ::= "$" ExpansionSpec
>>>     ExpansionSpec ::= FormattingOptions ExpansionExpr
>>>     ExpansionSpec ::= ExpansionExpr
>>>     FormattingOptions ::= "(" string ")"
>>>     ExpansionExpr ::= "{" D_Expression "}"
>>>     ExpansionExpr ::= D_Identifier
>>>
>>> So the example above becomes "... $(08){var1+var2} ...": one character longer, but gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".
>>
>> Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, just with the obligation to follow them with {...}. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but to avoid surprising C programmers they should be kept). $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.
>
> I agree. You'd also need to worry about "%f{abc}" -- is this a local variable called 'f', followed by the text "{abc}", or is it printing a floating point number called abc?

All that trouble to post a BNF grammar, wasted! :P

'f' is a valid identifier, so it's printing a variable called 'f', followed by the text '{abc}'.

Incidentally, I used '$' so that people wouldn't confuse it with a C printf format string (so the third possible case doesn't really count in my version :P).

-- Daniel

--
Unlike Knuth, I have neither proven or tried the above; it may not even make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/
Mar 07 2007
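Daniel Keep's reading of "$f{abc}" follows directly from his BNF, and a small scanner makes that concrete. The following is only a sketch in present-day D, not code from the thread: the names Piece, skipTo and scan are hypothetical, nested braces or parentheses are not handled, and the grammar as posted defines no way to write a literal "$", so neither does this.

```d
import std.ascii : isAlpha, isAlphaNum;

/// One piece of a scanned format string (hypothetical type).
struct Piece
{
    bool isExpansion; // false for literal text
    string options;   // text between "(" and ")", empty if absent
    string text;      // identifier, "{...}" contents, or the literal itself
}

/// Returns the index just past the first `close` at or after `from`.
size_t skipTo(string s, size_t from, char close)
{
    while (from < s.length && s[from] != close)
        ++from;
    if (from == s.length)
        throw new Exception("missing '" ~ close ~ "'");
    return from + 1;
}

Piece[] scan(string fmt)
{
    Piece[] pieces;
    size_t literalStart = 0;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] != '$') { ++i; continue; }

        // Emit the literal text that precedes this "$".
        if (i > literalStart)
            pieces ~= Piece(false, "", fmt[literalStart .. i]);
        size_t j = i + 1;

        // Optional FormattingOptions ::= "(" string ")"
        string options;
        if (j < fmt.length && fmt[j] == '(')
        {
            immutable end = skipTo(fmt, j + 1, ')');
            options = fmt[j + 1 .. end - 1];
            j = end;
        }

        // ExpansionExpr ::= "{" D_Expression "}" | D_Identifier
        string text;
        if (j < fmt.length && fmt[j] == '{')
        {
            immutable end = skipTo(fmt, j + 1, '}');
            text = fmt[j + 1 .. end - 1];
            j = end;
        }
        else
        {
            size_t k = j;
            if (k < fmt.length && (isAlpha(fmt[k]) || fmt[k] == '_'))
            {
                ++k;
                while (k < fmt.length && (isAlphaNum(fmt[k]) || fmt[k] == '_'))
                    ++k;
            }
            if (k == j)
                throw new Exception("expected identifier or '{' after '$'");
            text = fmt[j .. k];
            j = k;
        }
        pieces ~= Piece(true, options, text);
        i = literalStart = j;
    }
    if (fmt.length > literalStart)
        pieces ~= Piece(false, "", fmt[literalStart .. $]);
    return pieces;
}

void main()
{
    // Don Clugston's case: identifier 'f', then the literal "{abc}".
    assert(scan("$f{abc}") == [Piece(true, "", "f"), Piece(false, "", "{abc}")]);

    // Formatting options followed by a braced expression.
    assert(scan("x $(08){var1+var2} y") ==
        [Piece(false, "", "x "), Piece(true, "08", "var1+var2"), Piece(false, "", " y")]);
}
```

Because the options live inside "(...)" and an identifier ends at the first non-identifier character, the "%f{abc}" ambiguity Don raises for the printf-style syntax does not arise in this grammar.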
renoX wrote:
> Daniel Keep wrote:
>> To be honest, I think the type suffix needs to go. After all, if you know what the type is at compile-time, why do I need to repeat it?
>
> If the compatibility with printf (allowing %d without {...} in the format string) is removed (and I think I'll do this), then %{var} would also be allowed.
>
>> Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output. So how about something like this:
>>
>>     Expansion ::= "$" ExpansionSpec
>>     ExpansionSpec ::= FormattingOptions ExpansionExpr
>>     ExpansionSpec ::= ExpansionExpr
>>     FormattingOptions ::= "(" string ")"
>>     ExpansionExpr ::= "{" D_Expression "}"
>>     ExpansionExpr ::= D_Identifier
>>
>> So the example above becomes "... $(08){var1+var2} ...": one character longer, but gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".
>
> Printf format strings are quite powerful, and they're well known, so I think their syntax should be kept, just with the obligation to follow them with {...}. So the following would work: %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but to avoid surprising C programmers they should be kept).

(What, and having type inference, templates, aa's, compile-time function evaluation, et al. isn't going to surprise them?! :P This is incidentally WHY I picked "$": so that they don't think "Hey; this is a printf format string!")

> $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.

I'm sure bash and PHP programmers would disagree with you. :P I chose the syntax precisely because there are languages out there that do the *exact* same thing: the only thing that's new is the (opt) part.

You want to cater to C programmers, and that's fine; I'm more interested in coming from a different angle (mostly because I like how these look).

>> Plus, if you discount the formatting stuff out the front, it's roughly comparable to how variable expansions are written in bash and the like. I also think that Nemerle (which has had this sort of compile-time printf stuff for ages) does it the same way.
>>
>> As for the spec itself: it should be const char[] only, and display a meaningful error if the programmer tries to pass a non-const char[].
>
> For the const char[] only, I agree. The problem with the 'meaningful error' is that D doesn't provide a way to print the line number of where the template was called.

A trick might be to use a... hmm... I wonder, if your function returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...) keyword, which line would the compiler say it was on? :P

>> That said, I think you should also provide a "run-time" version of the function that has the exact same parser, formatting, etc., but the user can pass one or more hash maps to the function. This would allow people to use the same format for both compile and runtime, whilst not making the runtime version a security risk (well, aside from arbitrary expressions, anyway). For example:
>>
>>     auto author = "renoX";
>>     auto d_bdfl = "Walter Bright";
>>     auto life = 42;
>>     mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
>>                     "Meaning of life: $life"));
>>     // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42
>>
>>     char[][char[]] strings;
>>     int[char[]] ints;
>>     strings["author"] = author;
>>     strings["d-bdfl"] = d_bdfl;
>>     ints["life"] = life;
>>     auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
>>                      "Meaning of life: $life";
>>     writefln(formatstr, strings, ints);
>>     // Prints the same thing as above
>
> Python has this string manipulation possibility with the associative array, I think. What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?

For the first, it simply uses the first one it finds. Actually, this design is bad because of the multiple aa thing, but I only did that because Python is dynamically typed. And there are cases where being able to dump multiple types into a format string is very handy.

For the second, it's clearly a run-time error; throw an exception. One possible alternative is to have a slightly different version which will only fill in the fields it can, and leave the others.

>> -- Daniel
>>
>> P.S. $(q){...} is stolen from Lua's "%q" format specifier: prints a string out complete with escaping and quotation marks :P
>
> I don't understand the difference between %q and %s.
>
> Regards,
> renoX
    auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
    writefln("$(s)msg", msg);
    writefln("$(q)msg", msg);

Produces:

    This is a "string".
        It has some escaped characters.
    "This is a \"string\".\n\tIt has some escaped characters."

This is, admittedly, more useful in Lua, which has an eval function :P

-- Daniel
Mar 07 2007
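The lookup rule Daniel Keep describes above (use the first map that contains the key, and treat a key found in no map as a run-time error) can be sketched as follows. This is an illustration in present-day D under stated assumptions, not the proposed writefln overload: the function name lookup and its fixed two-map signature are hypothetical.

```d
import std.conv : to;
import std.exception : assertThrown;

/// Hypothetical helper: search the maps in the order given; the first hit wins.
string lookup(string key, string[string] strings, int[string] ints)
{
    if (auto p = key in strings)
        return *p;
    if (auto p = key in ints)
        return to!string(*p);
    // Present in no map: a run-time error, as suggested in the post.
    throw new Exception("no value supplied for '" ~ key ~ "'");
}

void main()
{
    string[string] strings = ["author": "renoX", "life": "forty-two"];
    int[string] ints = ["life": 42, "year": 2007];

    assert(lookup("author", strings, ints) == "renoX");
    // "life" is in both maps; the map searched first wins.
    assert(lookup("life", strings, ints) == "forty-two");
    // Found only in the second map.
    assert(lookup("year", strings, ints) == "2007");
    // Found nowhere.
    assertThrown!Exception(lookup("missing", strings, ints));
}
```

The lenient variant mentioned in the post would return the original "$name" text in place of the throw, leaving unfilled fields visible in the output.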
Daniel Keep wrote:
> renoX wrote:
>> Daniel Keep wrote:
>>> To be honest, I think the type suffix needs to go. After all, if you know what the type is at compile-time, why do I need to repeat it?
>>>
>>> Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output. So how about something like this:
>>>
>>>     Expansion ::= "$" ExpansionSpec
>>>     ExpansionSpec ::= FormattingOptions ExpansionExpr
>>>     ExpansionSpec ::= ExpansionExpr
>>>     FormattingOptions ::= "(" string ")"
>>>     ExpansionExpr ::= "{" D_Expression "}"
>>>     ExpansionExpr ::= D_Identifier
>>>
>>> So the example above becomes "... $(08){var1+var2} ...": one character longer, but gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".
>>
>> %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I agree, but to avoid surprising C programmers they should be kept).
>
> (What, and having type inference, templates, aa's, compile-time function evaluation, et al. isn't going to surprise them?! :P This is incidentally WHY I picked "$": so that they don't think "Hey; this is a printf format string!")
>
>> $somevar or %somevar without {...}, I don't know; it's shorter, sure, but less readable, I think.
>
> I'm sure bash and PHP programmers would disagree with you. :P I chose the syntax precisely because there are languages out there that do the *exact* same thing: the only thing that's new is the (opt) part.

Note that in Ruby, also a scripting language, the syntax is "...#{<var>}...". Of course there were some requests to allow #var, but so far Matz has rejected them (I think; I don't follow Ruby closely).

> You want to cater to C programmers, and that's fine; I'm more interested in coming from a different angle (mostly because I like how these look).

Well, currently D is using C-style format strings, so I'm not sure what the gain would be from changing the syntax.

>>> Plus, if you discount the formatting stuff out the front, it's roughly comparable to how variable expansions are written in bash and the like. I also think that Nemerle (which has had this sort of compile-time printf stuff for ages) does it the same way.
>>>
>>> As for the spec itself: it should be const char[] only, and display a meaningful error if the programmer tries to pass a non-const char[].
>>
>> The problem with the 'meaningful error' is that D doesn't provide a way to print the line number of where the template was called.
>
> A trick might be to use a... hmm... I wonder, if your function returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...) keyword, which line would the compiler say it was on? :P

Interesting suggestion. I'll try it.

>>> That said, I think you should also provide a "run-time" version of the function that has the exact same parser, formatting, etc., but the user can pass one or more hash maps to the function. This would allow people to use the same format for both compile and runtime, whilst not making the runtime version a security risk (well, aside from arbitrary expressions, anyway). For example:
>>>
>>>     auto author = "renoX";
>>>     auto d_bdfl = "Walter Bright";
>>>     auto life = 42;
>>>     mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
>>>                     "Meaning of life: $life"));
>>>     // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42
>>>
>>>     char[][char[]] strings;
>>>     int[char[]] ints;
>>>     strings["author"] = author;
>>>     strings["d-bdfl"] = d_bdfl;
>>>     ints["life"] = life;
>>>     auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
>>>                      "Meaning of life: $life";
>>>     writefln(formatstr, strings, ints);
>>>     // Prints the same thing as above
>>
>> What is supposed to happen if one key belongs to several associative arrays? Or if it doesn't exist in any of the associative arrays provided?
>
> For the first, it simply uses the first one it finds. Actually, this design is bad because of the multiple aa thing, but I only did that because Python is dynamically typed. And there are cases where being able to dump multiple types into a format string is very handy.
>
> For the second, it's clearly a run-time error; throw an exception. One possible alternative is to have a slightly different version which will only fill in the fields it can, and leave the others.
>
>>> -- Daniel
>>>
>>> P.S. $(q){...} is stolen from Lua's "%q" format specifier: prints a string out complete with escaping and quotation marks :P
>>
>> Regards,
>> renoX
>
>     auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
>     writefln("$(s)msg", msg);
>     writefln("$(q)msg", msg);
>
> Produces:
>
>     This is a "string".
>         It has some escaped characters.
>     "This is a \"string\".\n\tIt has some escaped characters."
>
> This is, admittedly, more useful in Lua, which has an eval function :P
>
> -- Daniel

OK, thanks for the information.

renoX
Mar 07 2007
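The %q versus %s distinction discussed above comes down to one transformation: %s emits the string as-is, while a %q-style option re-escapes it and wraps it in quotation marks so that it reads as a source literal. A minimal sketch in present-day D, assuming only the four escapes that appear in Daniel Keep's example; the function name quoted is hypothetical and not part of any proposed API.

```d
/// Hypothetical helper mimicking what a %q-style option might do.
string quoted(string s)
{
    string result = "\"";
    foreach (char c; s)
    {
        switch (c)
        {
            case '"':  result ~= `\"`; break; // backslash, then quote
            case '\\': result ~= `\\`; break; // two backslashes
            case '\n': result ~= `\n`; break; // backslash, then 'n'
            case '\t': result ~= `\t`; break; // backslash, then 't'
            default:   result ~= c;    break;
        }
    }
    return result ~ "\"";
}

void main()
{
    auto msg = "This is a \"string\".\n\tIt has some escaped characters.";

    // The %s-style output is msg itself; the %q-style output is its literal form.
    assert(quoted(msg) == `"This is a \"string\".\n\tIt has some escaped characters."`);
    assert(quoted("plain") == `"plain"`);
}
```

The backtick-delimited strings are D's WYSIWYG literals, so each backslash in them is an actual backslash character in the result.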