digitalmars.D.bugs - About Format String Attack for D's *writef*()

is91042 <xrwang cs.NCTU.edu.tw> writes:
Format string attacks are a known problem with C's printf().

Consider the following C code:

	char name[100];
	printf("Please input your name: ");
	scanf("%99s", name);	/* bounded read: at most 99 characters plus NUL */
	printf(name);

If a user types "John", he will get:

	Please input your name: John
	John

However, he may type "John%sWang" and get:

	Please input your name: John%sWang
	JohnJohn%sWangWang (The behavior is undefined, so the actual output depends on the runtime environment.)

These bugs can be prevented by telling programmers:
"Don't let user input be the first argument of printf()",
because printf() interprets only its first parameter as a format string.
But the same rule is not enough to prevent these bugs in *writef*(),
because of the power of *writef*().
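
For the C side, the rule above can be sketched as follows. This is only a minimal illustration, and the helper names print_user_text and copy_user_text are made up for this example. The point is that the user's text is passed as data to a fixed "%s" format and is never used as the format itself.

```c
#include <stdio.h>

/* Hypothetical helper: prints user-supplied text verbatim.
   The text is an argument to a fixed "%s" format, so any '%'
   it contains is never interpreted as a conversion. */
int print_user_text(FILE *out, const char *text)
{
    return fprintf(out, "%s", text);
}

/* Same idea, writing into a caller-supplied buffer.
   Returns the length of text, as snprintf does. */
int copy_user_text(char *dst, size_t size, const char *text)
{
    return snprintf(dst, size, "%s", text);
}
```

With these helpers, print_user_text(stdout, name) writes "John%sWang" unchanged, whereas printf(name) would try to read a string argument that was never passed.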

The problem is that *writef*() can interpret not only the first parameter
but also many later parameters as format strings.
Consider the following code:

	int x=123, y=321;
	writefln("This is a test: %s. ", x,
		"And this is another test: %s.", y);
	writefln("This is a test: %s.",
			"And this is another test: %s.", x, y);

And the output will be:

	This is a test: 123. And this is another test: 321.
	This is a test: And this is another test: %s..123321

This shows that *writef*() interprets any string argument as a format string
unless it was consumed by a preceding format string.

Consider the following code.

	char[] user_name;
	writefln("Please Input your name: ");
	din.readf("%s", &user_name);
	writefln("Your name is ", user_name, ". And my name is Peter.");

If a user types "John", he will get:

	Please Input your name:
	John
	Your name is John. And my name is Peter.

However, he may type "John%sWang":

	Please Input your name:
	John%sWang
	Your name is John. And my name is Peter.Wang

This behavior is strange and not what we would expect.

Although we could take the same approach and require programmers to
put a "%s" argument before every string that users can influence, I think
that is not a good policy, because it places a heavy extra burden on
programmers and loses the convenience of *writef* treating many
arguments as format strings.

So, I suggest a solution: add a new type 'fstring', meaning
"format string", and have *writef*() treat fstrings and strings
differently. If a string is encountered, it is written out as-is. If an
fstring is encountered, it is interpreted as a format string, as before.

Moreover, to make an fstring easy to create, we could delimit its literal with f" and ".
For example:

	writefln(f"Your name is %s", user_name, ". And my name is Peter.");
Oct 05 2006
Derek Parnell <derek nomail.afraid.org> writes:
On Thu, 5 Oct 2006 07:01:30 +0000 (UTC), is91042 wrote:


 The problem is *writef*() can interpret not only the first but also many
 parameters as format strings.
Agreed. The way I handle this is to only use the first parameter to specify
the formatting tokens, and to specify one for each subsequent parameter.
Another is to make safe any user entered data.

For example:

	import std.stdio;
	import std.cstream;
	import std.string;

	// Replace all occurrences of '%' with '%%'
	char[] safe(char[] a)
	{
		int i;
		int j;

		j = 0;
		while(j < a.length)
		{
			i = std.string.find(a[j..$], '%');
			if (i < 0)
				break;
			i += j;
			a = a[0..i] ~ "%" ~ a[i..$];
			j = i + 2;
		}
		return a;
	}

	void main()
	{
		char[] user_name;
		writefln("Please Input your name: ");
		din.readf("%s", &user_name);

		// Safer
		writefln("A,Your name is ", safe(user_name), ". And my name is Peter.");

		// My preference
		writefln("B,Your name is %s. And my name is Peter.", user_name);
	}

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
5/10/2006 6:01:41 PM
Oct 05 2006
Anders F Björklund <afb algonet.se> writes:
is91042 wrote:

 The problem is *writef*() can interpret not only the first but also many
 parameters as format strings.
This is a feature, not a bug...
 It shows that *writef*() interpret any string as a format string if it way
 not assigned by any other format strings.
 
 Consider the following code.
 
 	char[] user_name;
 	writefln("Please Input your name: ");
 	din.readf("%s", &user_name);
 	writefln("Your name is ", user_name, ". And my name is Peter.");
This is the expected behaviour with writef; you need to use "%s".
You get the same with printf if you concatenate the strings.

This is why I think using printf (in C) and writef (in D) *by default*
isn't very nice to newcomers, as it is harder...
There should be a simple function that just outputs a string.
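
A rough sketch of what such a function could look like, written in C for illustration. The name write_line and the NULL-terminated argument list are assumptions made for this example, not an existing library API. Every argument is written verbatim with fputs, so '%' never has a special meaning.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: writes each string argument as-is, then a newline.
   The argument list must end with a NULL pointer. No argument is ever
   treated as a format string.
   Returns 0 on success, EOF on a write error. */
int write_line(FILE *out, const char *first, ...)
{
    va_list args;
    const char *s;
    int status = 0;

    va_start(args, first);
    for (s = first; s != NULL; s = va_arg(args, const char *)) {
        if (fputs(s, out) == EOF) {
            status = EOF;
            break;
        }
    }
    va_end(args);

    if (status == 0 && fputc('\n', out) == EOF)
        status = EOF;
    return status;
}
```

A call such as write_line(stdout, "Your name is ", user_name, ". And my name is Peter.", (char *)NULL) prints the name unchanged, even if it contains "%s".
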
 Its behavior is so strange and is not what we expected.
You get the same "odd" behaviour in:

	writef("100% unexpected");

(You need to escape % by using %%, when you specify a format string.)
 Although we can use the same approach that we requires the programmers
 put an argument "%s" before every string affected by users, I think it
 is not a good privacy because it requires an extra heavy load for
 programmers and loses the convenience of that *writef* can treat many
 arguments as format strings.
 
 So, I suggest a solution: Add a new type 'fstring' as the meaning
 "format string" and *writef*() will do different thing for fstrings
 and strings. If a string is encountered, they dump the string.  If a
 fstring is encountered, they do the same thing as before.
My suggestion was instead to add a "write" function that would not
interpret the format character '%' but just output the string as-is?

	writeln("100% easier");
	writeln("Your name is ", user_name, ". And my name is Peter.");

See http://www.digitalmars.com/d/archives/digitalmars/D/21692.html
and http://www.digitalmars.com/d/archives/digitalmars/D/15627.html

--anders
Oct 05 2006
Lionello Lunesu <lio lunesu.remove.com> writes:
Anders F Björklund wrote:
 is91042 wrote:
 
 The problem is *writef*() can interpret not only the first but also many
 parameters as format strings.
This is a feature, not a bug...
 It shows that *writef*() interpret any string as a format string if it 
 way
 not assigned by any other format strings.

 Consider the following code.

     char[] user_name;
     writefln("Please Input your name: ");
     din.readf("%s", &user_name);
     writefln("Your name is ", user_name, ". And my name is Peter.");
This is the expected behaviour with writef, need to use "%s". You get the same with printf, if you concatenate the strings. Which is why I think using printf (in C) and writef (in D) *by default* isn't very nice to newcomers, as it is harder... There should be a simple function that just outputs a string.
 Its behavior is so strange and is not what we expected.
You get the same "odd" behaviour in: writef("100% unexpected"); (need to escape % by using %%, when you specify a format string)
 Although we can use the same approach that we requires the programmers
 put an argument "%s" before every string affected by users, I think it
 is not a good privacy because it requires an extra heavy load for
 programmers and loses the convenience of that *writef* can treat many
 arguments as format strings.

 So, I suggest a solution: Add a new type 'fstring' as the meaning
 "format string" and *writef*() will do different thing for fstrings
 and strings. If a string is encountered, they dump the string.  If a
 fstring is encountered, they do the same thing as before.
My suggestion was to instead add a "write" function, that would not interpret the format character '%' but just output the string as-is ?
Good idea.
Oct 05 2006
Anders F Björklund <afb algonet.se> writes:
is91042 wrote:

 Consider the following code.
 
 	char[] user_name;
 	writefln("Please Input your name: ");
 	din.readf("%s", &user_name);
 	writefln("Your name is ", user_name, ". And my name is Peter.");
BTW; "din" does not work in GDC on the Mac
(i.e. std.stream.readf doesn't, actually...):

	Please Input your name:
	Anders
	Your name is . And my name is Peter.

This is because there is no portable D standard for
how "typeid comparison" is supposed to work?
In DMD, one typeid === another. In GDC, only ==.
(meaning that "arguments[j] is typeid()" breaks)

And I think that readf should go in std.stdio...
(along with freadf, and also std.string.unformat)
http://www.digitalmars.com/d/archives/digitalmars/D/11021.html

--anders
Oct 05 2006