digitalmars.D - Error reading char in stdio.LockingTextReader.takeFront()

reply "MGW" <mgw yandex.ru> writes:
I think I have found a bug in the std.stdio module.
When reading a UTF-8 encoded file consisting of records 
like:

.......
16235;Иванов;Петр;Петрович;17.09.1961
8765;Петров;Иван;Васильевич;25.12.1978
.......

(about 30000 records in total)

an "invalid UTF-8 sequence" error is raised.

Investigation showed that the failure occurs in takeFront, when 
the next character is read from the buffer with 
FGETC(cast(_iobuf*)_f._p.handle). Moreover, the failure only 
happens when characters are pushed back to 
(cast(_iobuf*)_f._p.handle) with ungetc(). If the characters are 
simply read, without pushing them back with ungetc(), the error 
does not appear. For example, it was absent in dmd 2.066, where 
stdio.d did not use ungetc().
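
For readers unfamiliar with the pattern described above, here is a minimal sketch of "read a byte, then push it back" using the C stdio functions from D. It is only an illustration of the fgetc/ungetc idiom, not the code in stdio.d; the helper name peekByte and the scratch file name are made up for this example.

```d
import core.stdc.stdio : EOF, FILE, fclose, fgetc, fopen, fputs, remove, ungetc;

/// Hypothetical helper: look at the next byte without consuming it
/// by reading it with fgetc and pushing it back with ungetc.
int peekByte(FILE* fp)
{
    int c = fgetc(fp);
    if (c != EOF)
        ungetc(c, fp); // the C standard guarantees only one byte of pushback
    return c;
}

void main()
{
    const(char)* name = "peek_demo.txt"; // scratch file, name is an assumption

    // Write two bytes so there is something to read back.
    FILE* fp = fopen(name, "w");
    assert(fp !is null);
    fputs("ab", fp);
    fclose(fp);

    fp = fopen(name, "r");
    assert(fp !is null);
    assert(peekByte(fp) == 'a');   // peeking does not advance the stream
    assert(fgetc(fp) == 'a');      // the same byte is read again
    assert(fgetc(fp) == 'b');
    assert(peekByte(fp) == EOF);   // nothing left, nothing pushed back
    fclose(fp);
    remove(name);
}
```

A reader that decodes multi-byte UTF-8 sequences on top of such a pattern depends on the C runtime handling the pushed-back byte correctly at every buffer position, which is where the failure described above seems to show up.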
Jul 28 2015
parent reply "MGW" <mgw yandex.ru> writes:
Error in std.stdio.d
This example doesn't work!

// dmd 2.067.1 Win 32
import std.stdio;

void main(string[] args) {
	File fw = File("panic.csv", "w");
	for(int i; i != 5000; i++) {
		fw.writeln(i, ";", "Иванов;Пётр;Иванович");
	}
	fw.close();
	// Test read
	File fr = File("panic.csv", "r");
	int nom; string fam, nam, ot;
	// The formatted read fails here
	while(!fr.eof) fr.readf("%s;%s;%s;%s\n", &nom, &fam, &nam, &ot);
}

This bug appears to be the result of a faulty algorithm for 
reading from the file's ring buffer and pushing the read 
characters back into it with the stdio function ungetc().
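
Until this is resolved, one possible workaround is to avoid readf and parse each line by hand. The sketch below assumes that File.byLine does not go through the same takeFront/ungetc path, which has not been verified in this thread; the Record struct and the parseRecord helper are hypothetical names introduced for the example.

```d
import std.array : split;
import std.conv : to;
import std.exception : enforce;
import std.stdio : File, writeln;
import std.string : strip;

/// Hypothetical record type matching the four fields of the test file.
struct Record
{
    int nom;
    string fam, nam, ot;
}

/// Split one line of the form "nom;fam;nam;ot" into a Record.
Record parseRecord(const(char)[] line)
{
    // strip removes a trailing '\r' or other surrounding whitespace
    auto fields = line.strip.split(";");
    enforce(fields.length == 4, "expected 4 fields separated by ';'");
    return Record(fields[0].to!int,
                  fields[1].idup, fields[2].idup, fields[3].idup);
}

void main()
{
    // Same test data as in the failing example.
    auto fw = File("panic.csv", "w");
    foreach (i; 0 .. 5000)
        fw.writeln(i, ";", "Иванов;Пётр;Иванович");
    fw.close();

    // Read whole lines instead of using readf.
    int count;
    foreach (line; File("panic.csv", "r").byLine)
    {
        auto rec = parseRecord(line);
        assert(rec.nom == count);
        ++count;
    }
    writeln(count, " records read");
}
```

The fields are copied with idup because byLine reuses its buffer between iterations.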
Aug 01 2015
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Sunday, 2 August 2015 at 06:01:01 UTC, MGW wrote:
 Error in std.stdio.d
 This example doesn't work!

 // dmd 2.067.1 Win 32
 import std.stdio;

 void main(string[] args) {
 	File fw = File("panic.csv", "w");
 	for(int i; i != 5000; i++) {
 		fw.writeln(i, ";", "Иванов;Пётр;Иванович");
 	}
 	fw.close();
 	// Test read
 	File fr = File("panic.csv", "r");
 	int nom; string fam, nam, ot;
 	// The formatted read fails here
 	while(!fr.eof) fr.readf("%s;%s;%s;%s\n", &nom, &fam, &nam, &ot);
 }

 This bug appears to be the result of a faulty algorithm for 
 reading from the file's ring buffer and pushing the read 
 characters back into it with the stdio function ungetc().
Please report the issue at https://issues.dlang.org if you have not already. - Jonathan M Davis
Aug 01 2015