digitalmars.D.learn - The difference in string and char[], readf() and scanf()

reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
Hi,
Please tell me why this code always works correctly:

import std.stdio;

int n;
readf("%s\n", &n);

string s, t;
readf("%s\n%s\n", &s, &t);

And why this code does not always work correctly:

import std.stdio;

readf("%s\n", &n);

char[200010] s, t;
scanf("%s%s", s.ptr, t.ptr);

Input always has this format (n is the length of the strings 
str1 and str2, 1 <= n <= 200000; lowercase letters only):

n
str1
str2

For example:

5
cater
doger
Mar 21 2015
parent reply "anonymous" <anonymous example.com> writes:
On Saturday, 21 March 2015 at 08:37:59 UTC, Dennis Ritchie wrote:
 Please tell me why this code always works correctly:
[...]
 And why this code does not always work correctly:

 import std.stdio;

 readf("%s\n", &n);

 char[200010] s, t;
 scanf("%s%s", s.ptr, t.ptr);
Please go into more detail about how it doesn't work.
Mar 21 2015
parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Saturday, 21 March 2015 at 12:08:05 UTC, anonymous wrote:
 Please go into more detail about how it doesn't work.
Task: http://codeforces.com/contest/527/problem/B?locale=en

It works:

char[200010] s, t;
s = readln.strip;
t = readln.strip;

http://codeforces.com/contest/527/submission/10377392?locale=en

It doesn't always work:

char[200010] s, t;
scanf("%s%s", s.ptr, t.ptr);

http://codeforces.com/contest/527/submission/10376852?locale=en

P.S. I can't copy test №23 completely.
Mar 21 2015
parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
In C++ it is fully working:

char s[200005], t[200005];
scanf("%s%s", s, t);

http://codeforces.com/contest/527/submission/10376381?locale=en
Mar 21 2015
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Saturday, 21 March 2015 at 14:31:20 UTC, Dennis Ritchie wrote:
 In C++ it is fully working:

 char s[200005], t[200005];
 scanf("%s%s", s, t);
Indeed.

Generate a 100000-character string:
-----
import std.range, std.stdio;
void main () {'a'.repeat (100000).writeln;}
-----

Try to copy it with D scanf and printf:
-----
import std.stdio;
void main () {
	char [100000] a;
	scanf ("%s", a.ptr);
	printf ("%s\n", a.ptr);
}
-----

Only 32767 first characters of the string are actually copied.
Mar 21 2015
next sibling parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Saturday, 21 March 2015 at 15:05:56 UTC, Ivan Kazmenko wrote:
 On Saturday, 21 March 2015 at 14:31:20 UTC, Dennis Ritchie 
 wrote:
 In C++ it is fully working:

 char s[200005], t[200005];
 scanf("%s%s", s, t);
Indeed.
And why does D copy only the first 32767 characters of the string? For several days I couldn't understand what was going on...
 Generate a 100000-character string:
 -----
 import std.range, std.stdio;
 void main () {'a'.repeat (100000).writeln;}
 -----

 Try to copy it with D scanf and printf:
 -----
 import std.stdio;
 void main () {
 	char [100000] a;
 	scanf ("%s", a.ptr);
 	printf ("%s\n", a.ptr);
 }
 -----

 Only 32767 first characters of the string are actually copied.
Thank you very much.
Mar 21 2015
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Saturday, 21 March 2015 at 16:34:44 UTC, Dennis Ritchie wrote:
 And why does D copy only the first 32767 characters of the 
 string? For several days I couldn't understand what was going on...
To me, it looks like a bug somewhere, though I don't get where exactly. Is it in bits of DigitalMars C/C++ compiler code glued into druntime?

Anyway, as for Codeforces problems, you mostly need to read text input as tokens separated by spaces and/or newlines. For that, D I/O is sufficient, there is no need to use legacy C++ I/O.

Usually, readf(" %s", &v) works for every scalar type of variable v (including reals and 64-bit integers) except strings, and readln() does the thing for strings. Don't forget to get rid of the newline sequence on the previous line if you mix the two. Possible leading and trailing spaces in " %s " mean skipping all whitespace before or after the token, respectively, as is the case for scanf in C/C++.

As far as I remember, for reading a line of numbers separated by spaces,
-----
auto a = readln.split.map!(to!int).array;
-----
is a bit faster than a loop of readf filling the array, but that hardly matters in the majority of problems. You can see my submissions (http://codeforces.com/submissions/Gassa) for example.

If you really feel the need for I/O better suited for the specifics of algorithmic programming contests (as Java people almost always do in their language for some reason), look at Kazuhiro Hosaka's submissions (http://codeforces.com/submissions/hos.lyric).

In case you want to go even further and write your own I/O layer for that, I'll point you to a recent discussion of text I/O methods here: http://stackoverflow.com/q/28922323/1488799 (see comments and answers).

Ivan Kazmenko.
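As a minimal sketch of the advice above, the input format from the first post (n, then str1, then str2, one per line) can be read with D I/O only, so the C runtime's scanf is never involved. The names Input, readInput and parseInts are made up for this illustration and are not from any submission in the thread.

```d
import std.algorithm : map;
import std.array : array, split;
import std.conv : to;
import std.stdio;
import std.string : strip;

// Hypothetical container for the three input lines.
struct Input { int n; string s, t; }

// Reads "n\nstr1\nstr2\n" line by line. strip() removes the trailing
// "\n" or "\r\n", so Windows line endings need no special handling.
Input readInput(File f)
{
    Input r;
    r.n = f.readln.strip.to!int;
    r.s = f.readln.strip;
    r.t = f.readln.strip;
    return r;
}

// Parses one line of whitespace-separated integers, using the same
// split/map/array chain as the one-liner quoted above.
int[] parseInts(string line)
{
    return line.split.map!(to!int).array;
}

void main()
{
    auto r = readInput(stdin);
    writeln(r.n, " ", r.s.length, " ", r.t.length);
}
```

Reading whole lines avoids mixing readf and readln, so there is no leftover newline to skip between the number and the strings.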
Mar 21 2015
next sibling parent "anonymous" <anonymous example.com> writes:
On Saturday, 21 March 2015 at 23:00:46 UTC, Ivan Kazmenko wrote:
 To me, it looks like a bug somewhere, though I don't get where 
 exactly.  Is it in bits of DigitalMars C/C++ compiler code 
 glued into druntime?
As far as I understand, the bug is in snn.lib's scanf. snn.lib is Digital Mars's implementation of the C standard library (aka C runtime library or just C runtime). By default, some version of the C runtime is linked into every D program, so that you (and phobos and druntime) can use it. snn.lib is used for Windows x86. For other targets, other implementations of the C runtime are used (which don't have that bug).
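To check which C runtime a given build is compiled against, a small sketch like the following may help. It relies on the predefined CRuntime_* version identifiers of the D language; the function name cRuntimeName and the returned labels are only illustrative, and compiler releases that predate these identifiers will simply report "other".

```d
import std.stdio;

// Returns a label for the C runtime selected at compile time.
string cRuntimeName()
{
    version (CRuntime_DigitalMars)
        return "Digital Mars (snn.lib)";
    else version (CRuntime_Microsoft)
        return "Microsoft";
    else version (CRuntime_Glibc)
        return "glibc";
    else
        return "other";
}

void main()
{
    writeln("C runtime: ", cRuntimeName());
}
```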
Mar 21 2015
prev sibling parent "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Saturday, 21 March 2015 at 23:00:46 UTC, Ivan Kazmenko wrote:
 On Saturday, 21 March 2015 at 16:34:44 UTC, Dennis Ritchie 
 wrote:
 And why does D copy only the first 32767 characters of the 
 string? For several days I couldn't understand what was going on...
 To me, it looks like a bug somewhere, though I don't get where exactly.
[...]
Thanks.
Mar 21 2015
prev sibling next sibling parent reply FG <home fgda.pl> writes:
On 2015-03-21 at 16:05, Ivan Kazmenko wrote:
 Generate a 100000-character string
[...]
 Try to copy it with D scanf and printf:
 -----
 import std.stdio;
 void main () {
      char [100000] a;
      scanf ("%s", a.ptr);
      printf ("%s\n", a.ptr);
 }
 -----

 Only 32767 first characters of the string are actually copied.
In what universe?! Which OS, compiler and architecture?
Mar 21 2015
parent reply "Dennis Ritchie" <dennis.ritchie mail.ru> writes:
On Saturday, 21 March 2015 at 19:09:59 UTC, FG wrote:
 In what universe?! Which OS, compiler and architecture?
Windows 8.1 x64, dmd 2.066.1:

import std.range, std.stdio;

void main () {
    stdout = File("in.txt", "w");
    'a'.repeat(100000).writeln;
}

import std.stdio;
import std.cstream;

void main () {
    freopen("in.txt", "r", din.file);
    freopen("out.txt", "w", dout.file);

    char[100000] a;
    scanf("%s", a.ptr);

    int lenA;
    foreach (i; 0 .. 100000) {
        if (a[i] == 'a')
            ++lenA;
        printf("%c", a[i]);
    }
    printf("\n%d\n", lenA); // 32767
}

By the way, in Ubuntu 14.04 LTS (dmd 2.066.1) everything works fine:

import std.range, std.stdio;

void main () {
    stdout = File("in.txt", "w");
    'a'.repeat(100000).writeln;
}

import std.stdio;
import std.cstream;

void main () {
    freopen("in.txt", "r", din.file);
    freopen("out.txt", "w", dout.file);

    char[100000] a;
    scanf("%s", a.ptr);

    int lenA;
    foreach (i; 0 .. 100000) {
        if (a[i] == 'a')
            ++lenA;
        printf("%c", a[i]);
    }
    printf("\n%d\n", lenA); // 100000
}
Mar 21 2015
parent reply FG <home fgda.pl> writes:
On 2015-03-21 at 21:02, Dennis Ritchie wrote:
 In what universe?! Which OS, compiler and architecture?
Windows 8.1 x64, dmd 2.066.1:
That's strange. I cannot recreate the problem on Win7 x64 with dmd 2.066.1, neither when compiled for 32- nor 64-bit. I have saved the a's to a file and use input redirect to load it, while the program is as follows:

import std.stdio;
void main () {
    char [100000] a;
    scanf ("%s", a.ptr);
    printf ("%s\n", a.ptr);
}

       freopen("in.txt", "r", din.file);
No, that approach didn't change the result. I still get 10000.
Mar 21 2015
parent FG <home fgda.pl> writes:
On 2015-03-21 at 22:15, FG wrote:
 On 2015-03-21 at 21:02, Dennis Ritchie wrote:
 In what universe?! Which OS, compiler and architecture?
Windows 8.1 x64, dmd 2.066.1:
That's strange. I cannot recreate the problem on Win7 x64 with dmd 2.066.1, neither when compiled for 32- nor 64-bit. I have saved the a's to a file and use input redirect to load it, while the program is as follows:
Oh, wait. I was wrong. I have the same problem. It didn't appear before because the file of A's that I used didn't have a \r\n EOL at the end. With those two bytes added it failed. It's the EOL at the end of the input word that's the problem. I tested four different inputs:

aaa...aaa            OK
aaa...aaa\r\n        FAIL
aaa...aaa bbb        OK
aaa...aaa bbb\r\n    OK
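For anyone who wants to repeat the comparison, the four inputs can be generated with a short D program. This is only a sketch: the helper name eolCases and the file names case1.txt to case4.txt are invented here, and the word length of 100000 is taken from the earlier reproduction in this thread. std.file.write stores the bytes exactly as given, so the "\r\n" endings are not altered by text-mode translation.

```d
import std.array : replicate;
import std.file : write;
import std.format : format;

// Builds the four inputs in the order listed above:
// word, word + CRLF, word + " bbb", word + " bbb" + CRLF.
string[] eolCases(size_t n)
{
    auto word = "a".replicate(n);
    return [word, word ~ "\r\n", word ~ " bbb", word ~ " bbb\r\n"];
}

void main()
{
    // One file per case; feed each to the scanf test via input redirection.
    foreach (i, content; eolCases(100_000))
        write(format("case%s.txt", i + 1), content);
}
```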
Mar 21 2015
prev sibling parent "anonymous" <anonymous example.com> writes:
On Saturday, 21 March 2015 at 15:05:56 UTC, Ivan Kazmenko wrote:
 Generate a 100000-character string:
 -----
 import std.range, std.stdio;
 void main () {'a'.repeat (100000).writeln;}
 -----

 Try to copy it with D scanf and printf:
 -----
 import std.stdio;
 void main () {
 	char [100000] a;
 	scanf ("%s", a.ptr);
 	printf ("%s\n", a.ptr);
 }
 -----

 Only 32767 first characters of the string are actually copied.
That doesn't happen on linux, but I could reproduce it in wine. Seems to be a bug in the C runtime (snn.lib). I filed an issue: https://issues.dlang.org/show_bug.cgi?id=14315
Mar 21 2015