digitalmars.D.bugs - [Issue 4458] New: Static typing for format strings, when possible

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458

           Summary: Static typing for format strings, when possible
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Keywords: diagnostic
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc


--- Comment #0 from bearophile_hugs eml.cc 2010-07-14 04:38:11 PDT ---
In some situations the format string of writef/writefln is not known at
compile-time, but in most situations it is. So in the frequent cases where the
format string is known at compile-time I'd like a compile-time error if the
types in the format string and the types of the arguments don't match.

A compile-time error is better: it gives the same advantages as static
typing, and it lets the programmer catch format string bugs before
runtime, across the whole program, even in parts of the code that haven't been
run yet (a runtime bug shows up only when a specific writefln call is actually
executed).

Currently (dmd v2.047) this compiles with no errors:


import std.stdio: writefln;
void main() {
    float f = 10.5;
    writefln("%d", f);
}


But I'd like an error similar to:

test.d(4): Error: writefln format string type error, used format '%d' but
argument 'f' is of type float

Once written, this checking routine can be useful for other functions too:
format(), some I/O functions, and C functions like printf() that are
sometimes present in D programs.


This is a similar C program:

#include <stdio.h>
int main() {
    float f = 10.5;
    printf("%d\n", f);
    return 0;
}



If compiled with GCC 4.5:
gcc -Wall testc.c -o testc

It outputs at compile-time:
test.c: In function 'main':
test.c:4:5: warning: format '%d' expects type 'int', but argument 2 has type
'double'


GCC 4.5 is not able to spot the bug in this program:

#include <stdio.h>
int main() {
    float f = 10.5;
    const char* format = "%d\n";
    printf(format, f);
    return 0;
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 14 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458



--- Comment #1 from bearophile_hugs eml.cc 2010-09-23 15:34:27 PDT ---
See also bug 4927

Sep 23 2010
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458



--- Comment #2 from bearophile_hugs eml.cc 2011-02-28 04:40:16 PST ---
SAL (Microsoft's standard source code annotation language) also has
__format_string, which denotes strings that contain % markers in the style of
printf:
http://msdn.microsoft.com/en-us/library/ms235402%28VS.80%29.aspx

In D it may be useful to catch bugs like this:

@format_string string f = "%d";
writeln(f, 10);

In most cases you don't want to print a format string. The compiler may show a
warning saying that you are passing a format string as the first argument of a
printing function that doesn't take a format string. This warning helps against
that bug.

Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458


Denis Derman <denis.spir gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |denis.spir gmail.com


--- Comment #3 from Denis Derman <denis.spir gmail.com> 2011-02-28 05:45:19 PST
---
(In reply to comment #0)
Most commonly, format string static checking would catch an argument count
mismatch at compile-time:

writefln("%s --> %s", a);

compiles and builds happily, then fails at runtime with:

std.format.FormatError: std.format Orphan format specifier: %%s --> %s

(A weird error message, in fact.)

Denis
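
The runtime failure described above can be reproduced with std.format.format,
which uses the same formatting machinery as writefln. This is only a minimal
sketch, assuming a reasonably recent Phobos in which the exception type is
std.format.FormatException (the quoted message shows the older FormatError
name). The helper name throwsOnOrphan is hypothetical.

```d
import std.format : format, FormatException;

// Hypothetical helper (not from the thread): returns true if a format
// string with two specifiers and only one argument fails at runtime.
bool throwsOnOrphan()
{
    int a = 1;
    try
    {
        // Two %s specifiers, but only one argument is supplied.
        auto s = format("%s --> %s", a);
        return false;
    }
    catch (FormatException e)
    {
        // The mismatch is reported only when this line is executed.
        return true;
    }
}

void main()
{
    assert(throwsOnOrphan());
}
```

The code compiles without complaint. The error shows up only if the call is
actually reached, which is the situation the enhancement request is about.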
Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458


Don <clugdbug yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug yahoo.com.au


--- Comment #4 from Don <clugdbug yahoo.com.au> 2011-02-28 08:26:05 PST ---
Note that because of typesafe varargs, this is a *much* smaller problem for D
than it is for C or C++. Your test case generates a run-time exception. In C,
it silently fails, and it can be quite difficult to realize that the printed
results differ from what you expected because there was an error in the
format string.

Personally I've wasted an enormous amount of time on this bug in C++ (and when
I use printf in D), but I've never had any problems with writefln or format.
I think it's a printf-specific bug.
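
To illustrate the point about typesafe varargs: the %d/float mismatch from
comment #0 surfaces as an exception rather than as garbage output. This is a
minimal sketch, assuming a recent Phobos where the exception type is
std.format.FormatException. The helper name mismatchIsCaught is hypothetical.

```d
import std.format : format, FormatException;

// Hypothetical helper (not from the thread): returns true if applying
// %d to a float is rejected by the formatting code at runtime.
bool mismatchIsCaught()
{
    float f = 10.5;
    try
    {
        // The argument's type is known to format(), so the wrong
        // specifier is detected instead of printing garbage.
        auto s = format("%d", f);
        return false;
    }
    catch (FormatException e)
    {
        return true;
    }
}

void main()
{
    assert(mismatchIsCaught());
}
```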

Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458



--- Comment #5 from bearophile_hugs eml.cc 2011-02-28 09:45:16 PST ---
Once present, these compile-time tests would also apply to the format strings
of printf and related functions, not just to those of writef/writefln.

Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |WONTFIX


--- Comment #6 from Walter Bright <bugzilla digitalmars.com> 2011-02-28
10:14:50 PST ---
(In reply to comment #5)
> Once present this compile-time tests are applied on the format strings of
> printf and related functions too, not just for writef/writefln.

D's method of fixing printf is to use writef instead.

C's printf method has serious problems because when there is a mismatch
between the format and the args, you get garbage results at runtime and even
crashes. Hence the motivation for a format checker. But in D, writef is
typesafe. Granted, the error will get caught at runtime rather than compile
time, but it is effective.

Furthermore, the incidence of such errors is much reduced for D. For example,
quick, what's the right printf format specifier for size_t? No problem for
writef, just use %s for any type.

Your argument that using %d for a float should be an error, and that it should
be replaced with %f, misses the point of how writef works. With writef, you do
NOT use a format specific to the type. Just use %s. The only time another
format would even be used is if something more specific was needed, such as
rounding to x many digits past the decimal point, or printing the value in hex
notation.

In summary,

1. fixing calls to printf falls outside of the scope of D. C is C, and D
deliberately does not attempt to fix C APIs. (Part of the reason for that is
that fixing C APIs means writing all our own documentation for those APIs.)

2. the big problems printf has are simply not there with writef. The remaining
issue (compile time vs runtime error) is small enough to be not worth a
language feature customized for it.

3. such a feature would lockstep the compiler to a specific library
implementation of writef, which is outside of the scope of the core language.
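
The "just use %s" advice can be illustrated with std.format.format, which
accepts the same format strings as writef. This is a minimal sketch, not code
from the thread, and the values are arbitrary.

```d
import std.format : format;

void main()
{
    size_t n = 42;
    float f = 10.5;

    // %s works for any argument type, so no type-specific specifier
    // (such as the one printf needs for size_t) has to be remembered.
    assert(format("%s", n) == "42");
    assert(format("%s", f) == "10.5");
    assert(format("%s", "text") == "text");

    // Other specifiers are needed only for a particular presentation,
    // for example a fixed number of decimals or hex notation.
    assert(format("%.2f", f) == "10.50");
    assert(format("%x", 255) == "ff");
}
```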
Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458



--- Comment #7 from bearophile_hugs eml.cc 2011-02-28 10:36:27 PST ---
> 1. fixing calls to printf falls outside of the scope of D.

I use printf often in D because writeln is template-heavy (see the size of the
compilation or the asm code produced by a hello world with writeln), because
printf is necessary if you want to debug writeln or check whether it's buggy,
and because (as I have shown in the past) printf is usually much faster than
writeln when I have to print a lot.

> 2. the big problems printf has are simply not there with writef. The
> remaining issue (compile time vs runtime error) is small enough to be not
> worth a language feature customized for it

I don't agree it's small enough. Languages like Python are dynamic, and when
you run them you have the source code at hand. In D you may have just an
already compiled binary. Since format strings are dynamically typed, you get
the worst of both worlds: the downsides of dynamic typing (unexpected errors at
run-time) together with the need to compile and maybe even the unavailability
of the source code. Most format strings are known at compile-time, so the
compiler is supposed to catch such errors at compile time. GCC and Clang do it,
in most cases.

> 3. such a feature would lockstep the compiler to a specific library
> implementation of writef, which is outside of the scope of the core language.

A way to solve this is to introduce pluggable static type systems, that is,
some functions written in D in Phobos that perform the compile-time tests on
the first argument of certain write/printf functions, if that value is known
at compile-time.

A temporary work-around (that doesn't work for printf) is to rename functions
like writefln to longer names, and to replace writef/writefln with two template
functions that take the format string as first template argument, perform the
compile-time tests and then (to reduce template bloat) call the normal
writefln functions that now have the longer names. But this causes some
template bloat that's absent with the pluggable type systems solution.
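
The template-argument work-around described above can be illustrated with
the form `format!fmt(args)` (with a matching `writefln!fmt(args)`). This
sketch assumes a Phobos release recent enough to provide it, since it
postdates the dmd version discussed in this thread. It is not presented as
the design proposed here, and the values are arbitrary.

```d
import std.format : format;

// With the format string passed as a template argument, an argument
// count mismatch is rejected during compilation instead of throwing
// at runtime.
static assert( __traits(compiles, format!"%s --> %s"(1, 2)));
static assert(!__traits(compiles, format!"%s --> %s"(1)));

void main()
{
    // A matching call behaves like the ordinary runtime version.
    assert(format!"%s --> %s"(1, 2) == "1 --> 2");
}
```

The check is written as library code and needs no compiler knowledge of
writef, at the cost of one template instantiation per distinct format string.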
Feb 28 2011
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4458


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com


--- Comment #8 from Jonathan M Davis <jmdavisProg gmx.com> 2011-02-28 11:00:31
PST ---
_Any_ output to the console is going to be inefficient. And the benefits of
writeln _far_ outweigh any possible degradation in performance in comparison to
printf. And personally, I have _rarely_ had problems with writeln. I just use
%s all of the time, and it works. If you actually need to use a format
specifier other than %s, it's not exactly strenuous to be required to be
careful about it - especially if it's a rare event. You're still no worse off
than you are with printf. I really think that this is a non-issue.

Feb 28 2011