digitalmars.D - Buffered Files & Associative Arrays

Michael <mcoupland gmail.com> writes:
Greetings all!

When I compile and run the program below with a sample input file, test.txt, I
get some very strange behavior. It looks as if BufferedFile is handing back
strings that, for some reason, the associative array can't handle.

With test.txt containing three one-character lines:
	a
	b
	c

...I get the output:
	a 1 
	b 2 b 2 
	c 3 c 3 c 3 

...rather than the expected:
	a 1 
	a 1 b 2 
	a 1 b 2 c 3 

With test.txt containing longer strings:
	first
	second
	third

...the program crashes entirely with the following output:
	first 1
	Error: ArrayBoundsError TestArray(15)


However, if I replace the two relevant lines with the following:
	string[] file = ["first","second","third"]; // or ["a","b","c"]
	foreach( int n, string line; file )

...then the program runs as expected. But what's the difference? Adding
newlines to the string constants above does no harm, and stray newlines were
what I had first suspected as the culprit.

I don't think I'm missing anything obvious; can someone please confirm I'm not
crazy?

Thanks!
	Michael

--------------------------------------------------------------

import std.stdio;
import std.stream;

int main( char[][] args )
{
	int[string] Ar;
	
	Stream file = new BufferedFile("test.txt");
	
	foreach( ulong n, string line; file )
	{
		Ar[line] = n;
		
		foreach( string k; Ar.keys )
			writef("%s %d ", k, Ar[k] );

		writefln("");
	}

	return 0;
}
Jan 22 2008
"Unknown W. Brackets" <unknown simplemachines.org> writes:
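Well, this still happens with plain "File", so it isn't specific to
BufferedFile.

As it happens, the problem is the way you are holding on to File's buffer.
Each line you get is a slice of the stream's internal buffer, not a fresh
string, and the stream overwrites that space with new data on the next read.
The associative array stores the slice as its key without copying it, so keys
that are already stored change underneath it.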

Find:

Ar[line] = n;

Replace:

Ar[line.dup] = n;

That should solve your problems.
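
For readers on a current compiler, here is a minimal sketch of the same fix. It assumes today's Phobos, where `File.byLine` (which also reuses its line buffer) stands in for the stream's line iteration and the copy for a `string` key is spelled `.idup`. The function name `indexLines` and the scratch file name are made up for the illustration.

```d
import std.file : remove, write;
import std.range : enumerate;
import std.stdio : File, writeln;

/// Maps each line of the file at `path` to its 1-based line number.
/// (Hypothetical helper, not part of the original program.)
int[string] indexLines(string path)
{
    int[string] index;
    foreach (n, line; File(path).byLine.enumerate(1))
    {
        // `line` is a char[] slice of a buffer that byLine reuses,
        // so take an immutable copy before storing it as a key.
        index[line.idup] = cast(int) n;
    }
    return index;
}

void main()
{
    enum path = "aa_demo_test.txt"; // hypothetical scratch file
    write(path, "first\nsecond\nthird\n");
    scope (exit) remove(path);

    auto index = indexLines(path);
    assert(index == ["first": 1, "second": 2, "third": 3]);
    writeln(index.length, " distinct keys");
}
```

Because every key owns its own storage, later reads cannot change keys that are already in the array.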

-[Unknown]


Jan 22 2008
bearophile <bearophileHUGS lycos.com> writes:
Unknown W. Brackets:
 As it happens, the problem is the way you are abusing File's buffer. 
 You're taking the line, and using it... where the stream is overwriting 
 that space with new data.

Yes, D is rather unsafe in that regard. To avoid this kind of bug I add a "bool copy=true" template parameter (a compile-time constant) to all my classes that return iterable objects and manage a lot of data. By default they perform the copy, so you avoid that whole class of bugs. When you know what you are doing and want to go faster (sometimes 10 times faster), accepting slightly less safe code, you set the copy flag to false and the iterable keeps reusing the same buffer. I think Phobos could grow such an extra parameter in its iterable objects to avoid this kind of bug.

Bye,
bearophile
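
A rough sketch of what such a compile-time copy flag could look like in current D. This is not bearophile's actual code: the `LineReader` type, its fixed 64-byte buffer, and the in-memory `source` array (standing in for data read from a file) are all assumptions made for the illustration.

```d
/// Hypothetical line iterator with a compile-time `copy` flag.
struct LineReader(bool copy = true)
{
    private const(char)[][] source; // stand-in for lines arriving from a file
    private char[] buffer;          // scratch space reused for every line

    this(const(char)[][] source)
    {
        this.source = source;
        this.buffer = new char[64]; // assumption: no line exceeds 64 bytes
    }

    int opApply(scope int delegate(const(char)[] line) dg)
    {
        foreach (text; source)
        {
            assert(text.length <= buffer.length);

            // Simulate reading the next line into the shared buffer.
            buffer[0 .. text.length] = text[];
            const(char)[] line = buffer[0 .. text.length];

            static if (copy)
                line = line.dup; // safe default: the caller gets its own array

            if (auto result = dg(line))
                return result;
        }
        return 0;
    }
}

void main()
{
    const(char)[][] data = ["a", "b", "c"];

    // Default (copy = true): stored lines stay valid.
    const(char)[][] kept;
    foreach (line; LineReader!()(data))
        kept ~= line;
    assert(kept == data);

    // copy = false: every stored slice views the same buffer,
    // which holds only the last line once the loop has finished.
    const(char)[][] aliased;
    foreach (line; LineReader!false(data))
        aliased ~= line;
    const(char)[][] expected = ["c", "c", "c"];
    assert(aliased == expected);
}
```

The `copy = false` case reproduces the pattern from the original report, where every stored key ends up showing the most recently read line.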
Jan 22 2008
Gide Nwawudu <gide btinternet.com> writes:
On Tue, 22 Jan 2008 03:35:01 -0500, Michael <mcoupland gmail.com>
wrote:

[snip]

Without D2's const/invariant enhancements it is very easy to introduce this bug. FWIW, your code does not compile on D2. The following code produces the correct output.

import std.stdio;
import std.stream;

int main( char[][] args )
{
	int[string] Ar;
	
	Stream file = new BufferedFile("test.txt");
	
	foreach( ulong n, char[] line; file ) // mutable line variable
	{
		Ar[line.idup] = n; // idup needed
		
		foreach( string k; Ar.keys )
			writef("%s %d ", k, Ar[k] );

		writefln("");
	}

	return 0;
}

Gide
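
As a small illustration of why the type system helps here, the sketch below uses current D, where the qualifier is spelled `immutable` rather than `invariant`. The helper name `makeKey` is hypothetical, and a plain `char[]` stands in for the stream's buffer.

```d
/// Hypothetical helper: returns an immutable copy that is safe to keep as a key.
string makeKey(const(char)[] line)
{
    return line.idup;
}

void main()
{
    // A string is an array of immutable chars, and a mutable char[]
    // does not convert to it implicitly, so the copy must be explicit.
    static assert(is(string == immutable(char)[]));
    static assert(!is(char[] : string));

    char[] buffer = "first".dup;  // stand-in for the stream's buffer
    string key = makeKey(buffer);

    buffer[0 .. 5] = "third";     // the "stream" overwrites its buffer
    assert(key == "first");       // the copy is unaffected
}
```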
Jan 23 2008