digitalmars.D - Poll on improved format strings.


renoX <renosky free.fr> writes:

Hello,

I've written a set of format-string templates, attached (this new version is
improved thanks to Frits van Bommel), but I'm not sure about the syntax of the
format string.

The idea is: printf format strings are interesting because they are powerful,
but they suck because the %d, %s, etc. are in one part of the function call and
the corresponding variables are in a different part (Tango has the same
problem). writef improves on this by allowing "... %d",var," ... %s",var2, but
it's still not ideal: when gluing the various strings together, it's easy to
forget a space or a comma and end up with badly formatted output.

So my idea would be to have embedded expressions like this: "... %08d{var1+var2}
...", but it's not easy to come up with a good syntax and semantics, so I'd
like some remarks:

-Should mixing printf format and new-style format strings be allowed? (It is
in the current implementation.)
This has the advantage of nearly keeping compatibility. The problem is the
format string "..%d{...": in printf this means a number followed by '{', but
with the new format it is an error.
It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so
now '%d{' would need to be written '%d%{'. This has the drawback that it's no
longer possible to have the embedding format '%{var}', which would be the
shortest syntax.
Another possibility would be to say that if you want to have '... %d{ ...' you
need to write it as '.... %d',var,'{ ....'; this would allow the
'..%{var}...' embedding syntax.

-What to do with non-const char[]?
They cannot be parsed by the template, so one possibility is to allow only
const char[] parameters; another is to allow non-const char[] and leave them
alone (they may contain printf-style format strings). The latter is what the
current implementation does, but I'm not sure whether the added flexibility is
confusing: a const char[] can contain both printf-like formats and the 'new
embedded format', but a non-const char[] can only contain printf-like formats.


for these new format string..

I'd like some input to see if there is a majority in favour of one style or
the other.

renoX
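
A rough sketch of the embedded-expression idea from the post above. It is
written in Python purely for illustration and is not the attached D template:
the names split_embedded and render, the regular expression, and the use of
eval on a trusted literal are all assumptions. It splits a string such as
"sum=%08d{var1+var2}" into a plain printf-style format plus the list of
embedded expressions, treats %{expr} as %s, and leaves %% alone.

```python
import re

# "%%" is kept as a literal percent sign; otherwise match an optional
# printf-style spec (flags, width, precision, conversion) followed by {expr}.
_EMBED = re.compile(r"%%|%([-+ #0]*\d*(?:\.\d+)?[diouxXeEfgGs]?)\{([^{}]*)\}")

def split_embedded(fmt):
    """Return (printf_format, [expression, ...]) for an embedded format string."""
    exprs = []

    def repl(match):
        if match.group(0) == "%%":
            return "%%"
        spec, expr = match.group(1), match.group(2)
        # "%{var}" (or a spec without a conversion letter) defaults to %s.
        if not spec or spec[-1] not in "diouxXeEfgGs":
            spec += "s"
        exprs.append(expr.strip())
        return "%" + spec

    return _EMBED.sub(repl, fmt), exprs

def render(fmt, env):
    """Evaluate each embedded expression in env and apply the printf format.

    eval is acceptable here only because fmt is assumed to be a literal
    written by the programmer, never text supplied at run time.
    """
    plain, exprs = split_embedded(fmt)
    values = tuple(eval(expr, {}, env) for expr in exprs)
    return plain % values
```

For example, render("sum=%08d{var1+var2}", {"var1": 40, "var2": 2}) yields
"sum=00000042". A bare "%d" with no {...} is left untouched by
split_embedded, which corresponds to the "mixing allowed" option.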

Mar 06 2007

Don Clugston <dac nospam.com.au> writes:

renoX wrote:
 Hello,
 
 I've made a templated format string templates
 
 The idea is: printf format string are interesting because they are powerful
but they suck because the %d,%s, etc are in one part of the function and the
corresponding variable are in a different part of the function (Tango has the
same problem), writef improve this by allowing "... %d",var," ... %s",var2 but
it's still not ideal because in the gluing of the various strings, it's easy to
forget a space or a comma thus providing a not very good output.
 
 So my idea would be to have embedded expression like this "... %08d{var1+var2}
...", but it's not easy to provide a good syntax/semantic, so I'd like some
remarks:

I like this better than anything else I've ever seen.

 
 -Should the mix of printf format and new style format string be allowed? (It
is in the current implementation).
 This has the advantage of nearly keeping  the compatibility, the problem is
with the format string "..%d{..." in printf this means a number followed by '{'
but with the new format this creates an error.
 It is possible to escape the '{' to allow this, ie to say that '%{' is '{' so
now '%d{' would need to be '%d%{', this has the inconvenient that it's not
possible to have the embedding format '%{var}' which would be the shortest
syntax..
 Another possibility would be to say that if you want to have '... %d{ ...' one
need to write it has '.... %d',var,'{ ....', this would permit to have the
'..%{var}...' embedding syntax.

Personally I'd rather get an error if I leave off the {}.
(I'm someone who uses {} inside printf debugging strings a lot, so it's 
far from compatible for me).
I think it's better to minimise features wherever possible.

 -What to do with non-const char[]?

Aren't they a security risk?
eg,
char [] a  = getFromUser();
writefln(a);

if a is "%d", you get an access violation. Apart from security, it just 
hides an insidious bug -- the code works fine until someone innocently 
enters a % followed by one of the allowable letters....
I've always thought that the first argument to printf() should be forced 
to be a string literal.
I would see it as an *advantage*, to only support const char [] !
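
The failure mode described above can be reproduced in most printf-like
systems. The following is a Python analogue, not the D code from this post;
log_unsafe and log_safe are hypothetical names. It shows text from the user
being used as the format string itself, versus being passed as data to a
literal format string.

```python
def log_unsafe(user_text):
    # The user's text IS the format string: any '%' directive in it is
    # interpreted, and there are no arguments to satisfy it.
    return user_text % ()

def log_safe(user_text):
    # The format string is a literal; the user's text is only data.
    return "%s" % (user_text,)
```

log_unsafe works for ordinary input and only fails once someone types a '%'
followed by a conversion letter, which is the "insidious bug" pattern: in
Python it raises TypeError rather than causing an access violation.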

Mar 06 2007

renoX <renosky free.fr> writes:

Don Clugston wrote:
 renoX wrote:
 Hello,

 I've made a templated format string templates

 The idea is: printf format string are interesting because they are 
 powerful but they suck because the %d,%s, etc are in one part of the 
 function and the corresponding variable are in a different part of the 
 function (Tango has the same problem), writef improve this by allowing 
 "... %d",var," ... %s",var2 but it's still not ideal because in the 
 gluing of the various strings, it's easy to forget a space or a comma 
 thus providing a not very good output.

 So my idea would be to have embedded expression like this "... 
 %08d{var1+var2} ...", but it's not easy to provide a good 
 syntax/semantic, so I'd like some remarks:

 
 I like this better than anything else I've ever seen.
 
 -Should the mix of printf format and new style format string be 
 allowed? (It is in the current implementation).
 This has the advantage of nearly keeping  the compatibility, the 
 problem is with the format string "..%d{..." in printf this means a 
 number followed by '{' but with the new format this creates an error.
 It is possible to escape the '{' to allow this, ie to say that '%{' is 
 '{' so now '%d{' would need to be '%d%{', this has the inconvenient 
 that it's not possible to have the embedding format '%{var}' which 
 would be the shortest syntax..
 Another possibility would be to say that if you want to have '... %d{ 
 ...' one need to write it has '.... %d',var,'{ ....', this would 
 permit to have the '..%{var}...' embedding syntax.

 
 Personally I'd rather get an error if I leave off the {}.
 (I'm someone who uses {} inside printf debugging strings a lot, so it's 
 far from compatible for me).
 I think it's better to minimise features wherever possible.

And this has the benefit that %{var} is the 'default' embedded format 
string: no need to use %s{var}.
 -What to do with non-const char[]?

 
 Aren't they a security risk?
 eg,
 char [] a  = getFromUser();
 writefln(a);

Yes, they might be a security risk if they are not 'sanitized' before usage.

 
 if a is "%d", you get an access violation. Apart from security, it just 
 hides an insidious bug -- the code works fine until someone innocently 
 enters a % followed by one of the allowable letters....
 I've always thought that the first argument to printf() should be forced 
 to be a string literal.
 I would see it as an *advantage*, to only support const char [] !

Interesting.
Thanks for sharing your opinion, it's true that supporting only
const char[] and the 'embedded format string' makes the usage of 
putf/sputf easier for the programmer..

I think that I will follow your ideas.

Regards,
renoX

Mar 06 2007

Derek Parnell <derek psych.ward> writes:

On Tue, 06 Mar 2007 09:53:48 -0500, renoX wrote:

 Hello,
 
 I've made a templated format string templates 

...

 I'd like some inputs to see if there is a majority in favour of one style or
the other..

I will not be using this style of string formatting. 

I like to have the format strings as something that the user supplies at
runtime so that they can control the output of messages to their users -
especially when considering multiple language support. 

  msg = Expand( getMsgLayout( msgno ), vars ...);
  Output (msg );

where the msgno is a key to a runtime lookup for the layout of the message
which is suitable for the language of the current user.
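
A minimal sketch of this message-number scheme, in Python for brevity. The
LAYOUTS table, its contents, and the functions get_msg_layout and expand are
hypothetical stand-ins for the Expand/getMsgLayout calls above; a real
application would load the layouts from a per-language resource file.

```python
# Hypothetical per-language layout tables, keyed by message number.
LAYOUTS = {
    "en": {1: "File {0} has {1} lines"},
    "de": {1: "Datei {0} hat {1} Zeilen"},
}

def get_msg_layout(msgno, lang):
    """Look up the layout for a message number at run time."""
    return LAYOUTS[lang][msgno]

def expand(layout, *args):
    """Fill positional placeholders, so a translation may reorder them."""
    return layout.format(*args)
```

Usage: expand(get_msg_layout(1, "de"), "a.txt", 3) gives
"Datei a.txt hat 3 Zeilen". The layout is only known at run time, which is
why a compile-time-only format parser does not cover this case.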

-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell

Mar 06 2007

renoX <renosky free.fr> writes:

Derek Parnell wrote:
 On Tue, 06 Mar 2007 09:53:48 -0500, renoX wrote:
 
 Hello,

 I've made a templated format string templates 

 ...
 
 I'd like some inputs to see if there is a majority in favour of one style or
the other..

 
 I will not be using this style of string formatting. 
 
 I like to have the format strings as something that the user supplies at
 runtime so that they can control the output of messages to their users -
 especially when considering multiple language support. 

I hadn't thought about localisation; I'll have to take a look at how it's 
done currently in D to see if it's compatible.

   msg = Expand( getMsgLayout( msgno ), vars ...);
   Output (msg );

Well your scheme is simple to implement, but this isn't very readable..

The GNU localisation package uses the text in the default language 
(usually but not necessarily English) as a key to find the translations 
in a localisation file.
Of course the problem is that it requires a parser to do this..
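
To illustrate the key-by-default-text approach, here is a small Python
sketch. CATALOG_FR and translate are made-up names and the catalogue is a
plain dictionary; this is not the GNU package's API, only the lookup idea.

```python
# Hypothetical catalogue: the default-language text is the lookup key.
CATALOG_FR = {
    "File %s not found": "Fichier %s introuvable",
}

def translate(message, catalog):
    """Return the translation, falling back to the default-language text."""
    return catalog.get(message, message)
```

The call site stays readable because it contains the real message:
translate("File %s not found", CATALOG_FR) % "a.txt" gives
"Fichier a.txt introuvable", and an unknown message is returned unchanged.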

 where the msgno is a key to a runtime lookup for the layout of the message
 which is suitable for the langugage of the current user.

Note that both types of formatting are useful: yours is for user 
interfaces, mine is for trace logs (which are usually not localised).

They are not necessarily incompatible, I'll have to think about it.

renoX

Mar 06 2007

janderson <askme me.com> writes:

renoX wrote:
 Hello,
 
 
 So my idea would be to have embedded expression like this "... %08d{var1+var2}
..."

In any case I like the idea of having these strings embedded in the text.

I've had to write run-time versions of this many times for localization 
and for designers.  In these cases (and I assume most games work this 
way) there is a separation between design and code, so this would 
simplify the process.  With this format you could just make all the 
variables visible to the write statement (i.e. player1, player2, enemy1, 
etc.).

Of course in the cases I've dealt with, the designer/localizer should 
use the default string conversion for the given variable.  Possibly even 
rounding of floats should be specifiable outside and optionally changed 
inside.

-Joel

Mar 06 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

To be honest, I think the type suffix needs to go.  After all, if you
know what the type is at compile-time, why do I need to repeat it?

Of course, doing that leaves you with the problem of how to specify
formatting options... but then in the majority of cases, you probably
don't care; you just want the thing output.

So how about something like this:

Expansion ::= "$" ExpansionSpec

ExpansionSpec ::= FormattingOptions ExpansionExpr
ExpansionSpec ::= ExpansionExpr

FormattingOptions ::= "(" string ")"

ExpansionExpr ::= "{" D_Expression "}"
ExpansionExpr ::= D_Identifier

So the example above becomes "... $(08){var1+var2} ...": one character
longer, but gives you more freedom as to what you can put in the
formatting options.  Plus, if you don't care how it's formatted, you can
use "... ${var1+var2} ...", and if you just want to print a variable
out, you can use "... $somevar ...".
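
The grammar above is small enough to sketch as a tokenizer. This is an
illustrative Python version, not code from the thread: the regular
expression, the tokenize function, and the token tuple layout are all
assumptions, and D_Expression is approximated as "anything without braces".

```python
import re

# "$", optional "(options)", then either "{expression}" or a bare identifier.
_EXPANSION = re.compile(
    r"\$(?:\((?P<opts>[^)]*)\))?"
    r"(?:\{(?P<expr>[^{}]*)\}|(?P<ident>[A-Za-z_][A-Za-z0-9_]*))"
)

def tokenize(fmt):
    """Split fmt into ("text", s) and ("expand", options, target) tokens."""
    tokens = []
    pos = 0
    for match in _EXPANSION.finditer(fmt):
        if match.start() > pos:
            tokens.append(("text", fmt[pos:match.start()]))
        target = match.group("expr")
        if target is None:
            target = match.group("ident")
        tokens.append(("expand", match.group("opts"), target))
        pos = match.end()
    if pos < len(fmt):
        tokens.append(("text", fmt[pos:]))
    return tokens
```

With this sketch, "a $(08){var1+var2} b $somevar" yields an expansion with
options "08" and target "var1+var2", then one with no options and target
"somevar". A compile-time D version would presumably emit code for each
token instead of returning a list.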

Plus, if you discount the formatting stuff out the front, it's roughly
comparable to how variable expansions are written in bash and the like.
 I also think that Nemerle (which has had this sort of compile-time
printf stuff for ages) does it the same way.

As for the spec itself: it should be const char[] only, and display a
meaningful error if the programmer tries to pass a non-const char[].
That said, I think you should also provide a "run-time" version of the
function that has the exact same parser, formatting, etc., but the user
can pass one or more hash maps to the function.  This would allow people
to use the same format for both compile and runtime, whilst not making
the runtime version a security risk (well, aside from arbitrary
expressions, anyway).  For example:

 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above

	-- Daniel

P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
string out complete with escaping and quotation marks :P

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Mar 06 2007

Don Clugston <dac nospam.com.au> writes:

Daniel Keep wrote:
 
 
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.

When you use floating point, you want to specify the formatting options 
almost every time -- do you want %f, %e, %g, or %a? And it's almost 
always necessary to specify the number of decimal places to use.
I display integers in hex pretty often, too.

Still, being able to leave all the formatting options out, and write:
"next=%{i+1}" is very appealing.

Mar 07 2007

janderson <askme me.com> writes:

Don Clugston wrote:
 Daniel Keep wrote:

 
 When you use floating point, you want to specify the formatting options 
 almost every time -- do you want %f, %e, %g, or %a? And it's almost 
 always necessary to specify the number of decimal places to use.
 I display integers in hex pretty often, too.

Most of the time I want the same number of decimal places.  I think 
being able to provide a default decimal place would be a good idea.

Perhaps it could be in the first part of the string.  Then it could just 
append it (to hide from design/localizers/myself) something like: 
"%.2f()" ~ "blar: %(value)". Or maybe "%.2f=default" ~ "blar: %(value)".

 
 Still, being able to leave all the formatting options out, and write:
 "next=%{i+1}" is very appealing.

Agreed.
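
One way to picture the default-decimal-places suggestion from this post, as
a Python sketch. format_value and its default_float parameter are
hypothetical; the point is only that an explicit spec wins and floats
otherwise fall back to a default chosen in one place.

```python
def format_value(value, spec=None, default_float=".2f"):
    """Format value with spec; floats without a spec use default_float."""
    if spec is None:
        spec = default_float if isinstance(value, float) else ""
    return format(value, spec)
```

So format_value(3.14159) gives "3.14", format_value(3.14159, ".4f") gives
"3.1416", and integers such as format_value(255, "x") still take an explicit
spec when hex output is wanted.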

Mar 07 2007

Don Clugston <dac nospam.com.au> writes:

janderson wrote:
 Don Clugston wrote:
 Daniel Keep wrote:

 When you use floating point, you want to specify the formatting 
 options almost every time -- do you want %f, %e, %g, or %a? And it's 
 almost always necessary to specify the number of decimal places to use.
 I display integers in hex pretty often, too.

 
 Most of the time I want the same number of decimal places.  I think 
 being able to provide a default decimal place would be a good idea.

I think you're right.

Mar 07 2007

renoX <renosky free.fr> writes:

Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

If compatibility with printf (allowing %d without a {...} after it) is
removed (and I think I'll do this), then %{var} would also be allowed.

 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.
 
 So how about something like this:
 
 Expansion ::= "$" ExpansionSpec
 
 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr
 
 FormattingOptions ::= "(" string ")"
 
 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier
 
 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".

Printf format strings are quite powerful, and they're well known, so I think
their syntax should be kept, just with the obligation to follow them with
{...}; so the following would work:
%{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I
agree, but to avoid surprising C programmers they should be kept).

$somevar or %somevar without {...}, I don't know: it's shorter, sure, but less
readable, I think.

 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.
 
 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

For the const char[] only, I agree.
The problem for the 'meaningful error' is that D doesn't provide a way to print
the line number of where the template was called..

 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:
 
 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


Python has this kind of string formatting with associative arrays, I
think.
What is supposed to happen if one key belongs to several associative arrays? Or
if it doesn't exist in all the associative arrays provided?

 	-- Daniel
 
 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

I don't understand the difference between %q and %s.

Regards,
renoX

Mar 07 2007

Don Clugston <dac nospam.com.au> writes:

renoX wrote:
 $somevar or %somevar without {...}, I don't know, it's shorter, sure but less
readable I think.

I agree. You'd also need to worry about "%f{abc}" -- is this a local 
variable called 'f', followed by the text "{abc}", or is it printing a 
floating point number called abc ?
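
The two readings of "%f{abc}" can be made concrete with a small Python
sketch. Both functions are hypothetical and only model the ambiguity; they
are not part of either proposal's implementation.

```python
import re

def read_as_spec(fmt):
    """Reading 1: a conversion letter followed by an embedded variable."""
    match = re.fullmatch(r"%([a-zA-Z])\{(\w+)\}", fmt)
    return ("format", match.group(1), match.group(2)) if match else None

def read_as_ident(fmt):
    """Reading 2: a bare variable name followed by literal text."""
    match = re.match(r"%([A-Za-z_]\w*)", fmt)
    return ("variable", match.group(1), fmt[match.end():]) if match else None
```

Both functions accept "%f{abc}": the first returns ("format", "f", "abc")
and the second ("variable", "f", "{abc}"), so a syntax that allows bare
identifiers after '%' has to pick one rule and document it.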

Mar 07 2007

renoX <renosky free.fr> writes:

Don Clugston Wrote:
 renoX wrote:
 $somevar or %somevar without {...}, I don't know, it's shorter, sure but less
readable I think.

 
 I agree. You'd also need to worry about "%f{abc}" -- is this a local 
 variable called 'f', followed by the text "{abc}", or is it printing a 
 floating point number called abc ?

Good remark.
One thing that is a problem with the syntax of appending {var} after a printf
format string is that it doesn't work for the equivalent of
writef("%*d",width,var); maybe %*{width}{var}?

renoX
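
A sketch of how the tentative %*{width}{var} form could map back onto the
classic "%*d" behaviour, in Python (whose % operator also supports '*'
widths). expand_star and its regular expression are assumptions; eval is
used only because the format is assumed to be a programmer-written literal.

```python
import re

# "%*" followed by two brace groups: the width expression, then the value.
_STAR_FIELD = re.compile(r"%\*\{([^{}]*)\}\{([^{}]*)\}")

def expand_star(fmt, env):
    """Replace each %*{width}{value} with the value padded to that width."""
    def repl(match):
        width = int(eval(match.group(1), {}, env))
        value = eval(match.group(2), {}, env)
        # Same effect as the classic two-argument form "%*s" % (width, value).
        return "%*s" % (width, value)
    return _STAR_FIELD.sub(repl, fmt)
```

For comparison, "%*d" % (6, 42) and
expand_star("%*{width}{var}", {"width": 6, "var": 42}) both produce 42
right-aligned in a field of six characters.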

Mar 07 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

Don Clugston wrote:
 I agree. You'd also need to worry about "%f{abc}" -- is this a local
 variable called 'f', followed by the text "{abc}", or is it printing a
 floating point number called abc ?

All that trouble to post a BNF grammar, wasted!  :P

'f' is a valid identifier, so it's printing a variable called 'f',
followed by the text '{abc}'.  Incidentally, I used '$' so that people
wouldn't confuse it with a C printf format string (so the third possible
case doesn't really count in my version :P).

	-- Daniel


Mar 07 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

renoX wrote:
 Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

 
 If the compatibility with printf(allowing %d without {...} format string) is
removed (and I think I'll do this) then %{var} would be also allowed.
 
 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.

 So how about something like this:

 Expansion ::= "$" ExpansionSpec

 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr

 FormattingOptions ::= "(" string ")"

 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier

 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".

 
 Printf-Format string are quite powerful, and they're well known so I think
their syntax should be kept, just with the obligation to follow by {...}: so
the following would work:
 %{var}, %08X{var}, etc.. (%d{var} and %s{var} wouldn't be useful anymore, I
agree but to avoid surprising C programmers, they should be kept).

(What, and having type inference, templates, aa's, compile-time function
evaluation, et al. isn't going to surprise them?! :P  This is
incidentally WHY I picked "$": so that they don't think "Hey; this is a
printf format string!")

 $somevar or %somevar without {...}, I don't know, it's shorter, sure but less
readable I think.

I'm sure bash and PHP programmers would disagree with you. :P  I chose
the syntax precisely because there are languages out there that do the
*exact* same thing: the only thing that's new is the (opt) part.

You want to cater to C programmers, and that's fine; I'm more interested
in coming from a different angle (mostly because I like how these look).

 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.

 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

 
 For the const char[] only, I agree.
 The problem for the 'meaningful error' is that D doesn't provide a way to
print the line number of where the template was called..

A trick might be to use a... hmm...  I wonder, if your function
returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...)
keyword, which line would the compiler say it was on? :P

 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:

 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


 
 Python has this kind of string formatting with associative arrays
(dictionaries), I think.
 What is supposed to happen if one key belongs to several associative arrays?
Or if it doesn't exist in all the associative arrays provided?

For the first, it simply uses the first one it finds.  Actually, this
design is bad because of the multiple aa thing, but I only did that
because Python is dynamically typed.  And there are cases where being
able to dump multiple types into a format string is very handy.

For the second, it's clearly a run-time error; throw an exception.

One possible alternative is to have a slightly different version which
will only fill in the fields it can, and leave the others.
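
To make those rules concrete, here is a rough sketch of such a run-time
version. Everything in it is an assumption for illustration: the name expand,
the keepUnknown flag and the fixed pair of maps are made up, it uses D2's
string type rather than char[], and it leaves out the (opt) formatting part
entirely.

```d
import std.ascii : isAlphaNum;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical run-time expander for $name and ${name}.
// Lookup order: strings first, then ints (the first map that has the key wins).
// A key found in neither map throws, unless keepUnknown is set, in which case
// the original text of the expansion is left in place.
string expand(string fmt, string[string] strings, int[string] ints,
              bool keepUnknown = false)
{
    string result;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] != '$')
        {
            result ~= fmt[i++];
            continue;
        }
        immutable start = i++; // remember where the expansion began, skip '$'
        string name;
        if (i < fmt.length && fmt[i] == '{')
        {
            // ${name}: the key is everything up to the closing brace
            size_t end = i + 1;
            while (end < fmt.length && fmt[end] != '}')
                ++end;
            if (end == fmt.length)
                throw new Exception("unterminated ${...}");
            name = fmt[i + 1 .. end];
            i = end + 1;
        }
        else
        {
            // $name: a run of letters, digits and underscores
            size_t end = i;
            while (end < fmt.length && (isAlphaNum(fmt[end]) || fmt[end] == '_'))
                ++end;
            if (end == i) { result ~= '$'; continue; } // lone '$' stays literal
            name = fmt[i .. end];
            i = end;
        }
        if (auto s = name in strings) result ~= *s;
        else if (auto n = name in ints) result ~= to!string(*n);
        else if (keepUnknown) result ~= fmt[start .. i];
        else throw new Exception("no value for key '" ~ name ~ "'");
    }
    return result;
}

void main()
{
    string[string] strings = ["author": "renoX", "d-bdfl": "Walter Bright"];
    int[string] ints = ["life": 42];
    writeln(expand("Author: $author, BDFL: ${d-bdfl}, Meaning of life: $life",
                   strings, ints));
}
```

Since the run-time version only ever looks names up in the maps, a format
string coming from outside the program cannot evaluate arbitrary expressions.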

 	-- Daniel

 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

 
 I don't understand the difference between %q and %s.
 
 Regards,
 renoX

auto msg = "This is a \"string\".\n\tIt has some escaped characters.";

 writefln("$(s)msg", msg);
 writefln("$(q)msg", msg);

Produces:

 This is a "string".
         It has some escaped characters.
 "This is a \"string\".\n\tIt has some escaped characters."

This is, admittedly, more useful in Lua which has an eval function :P
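
A minimal sketch of what a "q" option might do under the hood, assuming a
current D2 compiler. The function name quoted is made up for this example,
and it only handles the four escapes that matter here; a real implementation
would need to cover more characters.

```d
import std.stdio : writeln;

// Hypothetical helper: returns s wrapped in double quotes, with quotes,
// backslashes, newlines and tabs written as escape sequences.
string quoted(string s)
{
    string result = "\"";
    foreach (char c; s)
    {
        switch (c)
        {
            case '"':  result ~= `\"`; break;
            case '\\': result ~= `\\`; break;
            case '\n': result ~= `\n`; break;
            case '\t': result ~= `\t`; break;
            default:   result ~= c;    break;
        }
    }
    return result ~ "\"";
}

void main()
{
    auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
    writeln(msg);         // plain text, like the "s" option
    writeln(quoted(msg)); // source-like form, like the "q" option
}
```

The quoted form can be pasted back into source code as a string literal,
which is why it pairs well with an eval function.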

	-- Daniel

-- 
Unlike Knuth, I have neither proven nor tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Mar 07 2007

renoX <renosky free.fr> writes:

Daniel Keep wrote:
 
 renoX wrote:
 Daniel Keep Wrote:
 To be honest, I think the type suffix needs to go.  After all, if you
 know what the type is at compile-time, why do I need to repeat it?

 If compatibility with printf (allowing %d without the {...} part) is
removed (and I think I'll do this), then %{var} would also be allowed.

 Of course, doing that leaves you with the problem of how to specify
 formatting options... but then in the majority of cases, you probably
 don't care; you just want the thing output.

 So how about something like this:

 Expansion ::= "$" ExpansionSpec

 ExpansionSpec ::= FormattingOptions ExpansionExpr
 ExpansionSpec ::= ExpansionExpr

 FormattingOptions ::= "(" string ")"

 ExpansionExpr ::= "{" D_Expression "}"
 ExpansionExpr ::= D_Identifier

 So the example above becomes "... $(08){var1+var2} ...": one character
 longer, but gives you more freedom as to what you can put in the
 formatting options.  Plus, if you don't care how it's formatted, you can
 use "... ${var1+var2} ...", and if you just want to print a variable
 out, you can use "... $somevar ...".
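
 As a rough illustration of the grammar above, the following sketch splits a
 format string into literal pieces and expansions. It is not the attached
 implementation: Piece and parse are made-up names, it assumes a current D2
 compiler, and it stops at the first "}" so expressions containing braces
 are not handled.

```d
import std.ascii : isAlphaNum;
import std.stdio : writeln;

// One piece of a parsed format string (hypothetical type for this sketch).
struct Piece
{
    bool isExpansion; // false: literal text, true: a $ expansion
    string options;   // FormattingOptions: text inside "(...)", may be empty
    string text;      // literal text, or the expression / identifier
}

Piece[] parse(string fmt)
{
    Piece[] pieces;
    size_t i = 0, litStart = 0;
    while (i < fmt.length)
    {
        if (fmt[i] != '$') { ++i; continue; }
        if (i > litStart)
            pieces ~= Piece(false, "", fmt[litStart .. i]);
        ++i; // skip '$'

        string options;
        if (i < fmt.length && fmt[i] == '(')
        {
            size_t close = i + 1;
            while (close < fmt.length && fmt[close] != ')') ++close;
            if (close == fmt.length) throw new Exception("missing ')'");
            options = fmt[i + 1 .. close];
            i = close + 1;
        }

        string expr;
        if (i < fmt.length && fmt[i] == '{')
        {
            // ExpansionExpr ::= "{" D_Expression "}" (naive brace matching)
            size_t close = i + 1;
            while (close < fmt.length && fmt[close] != '}') ++close;
            if (close == fmt.length) throw new Exception("missing '}'");
            expr = fmt[i + 1 .. close];
            i = close + 1;
        }
        else
        {
            // ExpansionExpr ::= D_Identifier (letters, digits, underscores)
            size_t end = i;
            while (end < fmt.length && (isAlphaNum(fmt[end]) || fmt[end] == '_'))
                ++end;
            if (end == i) throw new Exception("expected identifier or '{'");
            expr = fmt[i .. end];
            i = end;
        }
        pieces ~= Piece(true, options, expr);
        litStart = i;
    }
    if (litStart < fmt.length)
        pieces ~= Piece(false, "", fmt[litStart .. $]);
    return pieces;
}

void main()
{
    foreach (p; parse("sum: $(08){var1+var2}, name: $somevar"))
        writeln(p);
}
```

 A compile-time version would turn each expansion into an argument of the
 generated call, while a run-time version would look the text up in a map.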

 Printf format strings are quite powerful and well known, so I think
their syntax should be kept, just with the obligation to follow the specifier
with {...}, so the following would work:
 %{var}, %08X{var}, etc. (%d{var} and %s{var} wouldn't be useful anymore, I
agree, but to avoid surprising C programmers, they should be kept).

 
 (What, and having type inference, templates, aa's, compile-time function
 evaluation, et al. isn't going to surprise them?! :P  This is
 incidentally WHY I picked "$": so that they don't think "Hey; this is a
 printf format string!")
 $somevar or %somevar without {...}, I don't know, it's shorter, sure but less
readable I think.

 
 I'm sure bash and PHP programmers would disagree with you. :P  I chose
 the syntax precisely because there are languages out there that do the
 *exact* same thing: the only thing that's new is the (opt) part.

Note that in Ruby, which is also a scripting language, the syntax is
#{...}; [...] so far Matz has rejected them (I think, I don't follow Ruby
closely).

 You want to cater to C programmers, and that's fine; I'm more interested
 in coming from a different angle (mostly because I like how these look).

Well, D currently uses C-style format strings, so I'm not sure what
would be gained by changing the syntax.

 Plus, if you discount the formatting stuff out the front, it's roughly
 comparable to how variable expansions are written in bash and the like.
  I also think that Nemerle (which has had this sort of compile-time
 printf stuff for ages) does it the same way.

 As for the spec itself: it should be const char[] only, and display a
 meaningful error if the programmer tries to pass a non-const char[].

 For the const char[] only, I agree.
 The problem with the 'meaningful error' is that D doesn't provide a way to
print the line number where the template was called.

 
 A trick might be to use a... hmm...  I wonder, if your function
 returned, say "static assert(false, \"Ruh-rho!\")" to the mixin(...)
 keyword, which line would the compiler say it was on? :P

Interesting suggestion. I'll try it.

 That said, I think you should also provide a "run-time" version of the
 function that has the exact same parser, formatting, etc., but the user
 can pass one or more hash maps to the function.  This would allow people
 to use the same format for both compile and runtime, whilst not making
 the runtime version a security risk (well, aside from arbitrary
 expressions, anyway).  For example:

 auto author = "renoX";
 auto d_bdfl = "Walter Bright";
 auto life = 42;

 mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
     "Meaning of life: $life"));

 // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42

 char[][char[]] strings;
 int[char[]] ints;

 strings["author"] = author;
 strings["d-bdfl"] = d_bdfl;
 ints["life"] = life;

 auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
     "Meaning of life: $life";

 writefln(formatstr, strings, ints);

 // Prints the same thing as above


 Python has this kind of string formatting with associative arrays
(dictionaries), I think.
 What is supposed to happen if one key belongs to several associative arrays?
Or if it doesn't exist in all the associative arrays provided?

 
 For the first, it simply uses the first one it finds.  Actually, this
 design is bad because of the multiple aa thing, but I only did that
 because Python is dynamically typed.  And there are cases where being
 able to dump multiple types into a format string is very handy.
 
 For the second, it's clearly a run-time error; throw an exception.
 
 One possible alternative is to have a slightly different version which
 will only fill in the fields it can, and leave the others.
 
 	-- Daniel

 P.S.  $(q){...} is stolen from Lua's "%q" format specifier: prints a
 string out complete with escaping and quotation marks :P

 I don't understand the difference between %q and %s.

 Regards,
 renoX

 
 auto msg = "This is a \"string\".\n\tIt has some escaped characters.";
 
 writefln("$(s)msg", msg);
 writefln("$(q)msg", msg);

 
 Produces:
 
 This is a "string".
         It has some escaped characters.
 "This is a \"string\".\n\tIt has some escaped characters."

 
 This is, admittedly, more useful in Lua which has an eval function :P
 
 	-- Daniel

OK, thanks for the information.
renoX

Mar 07 2007
