digitalmars.D.learn - Read text file, line by line?

AEon <AEon_member pathlink.com> writes:
It seems *very* easy to open a text file in D; see the code below (snipped from the
wordcount example in the documentation).

<code>
   import std.file;
   // Open/read complete file into a D string! Nice.	
   char[] input;
   input = cast(char[])std.file.read(arg);		
</code>

I checked the other functions in std.file. AFAICT there seems to be no way to open
a file handle and then read *line by line* from, e.g., a config file?

Does one need to parse the file as one char[] array block, and do all the
line-by-line checking by hand?

AEon
Mar 19 2005
Martin Boeker <martin.boeker uniklinik-freiburg.de> writes:
For example, you could do something like this:

import std.stream;

void readfile(char[] fn){
	File f = new File();
	char[] l;
	f.open(fn);              // opens for reading by default
	while(!f.eof()) {
		l = f.readLine();    // returns the line without its terminator
		printf("line: %.*s\n", l);
	}
	f.close();
}

Martin

Mar 19 2005
Derek Parnell <derek psych.ward> writes:
Here is the code I used in the 'Build' utility...

import std.file;
import std.path;
import std.string;

// Read an entire file into a string.
char[] GetFileText(char[] pFileName)
{
    char[] lFileText;
    if (! std.file.exists( pFileName))
    {
        return "";
    }
    else
    {
        lFileText = cast(char[]) std.file.read(pFileName);
        // Ensure it ends with an EOL marker.
        if ( (lFileText.length == 0) || (lFileText[$-1] != '\n'))
            lFileText ~= std.path.linesep;
        return lFileText;
    }
}

// Read an entire file into a set of lines (strings).
char[][] GetFileTextLines(char[] pFileName)
{
    char[][] lLines;
    char[]   lText;
    char[]   lDelim;
    if (! std.file.exists(pFileName))
    {
        throw new Exception( std.string.format("File '%s' not found.", pFileName));
    }
    else
    {
        lText = GetFileText(pFileName);
        // Set the EOL marker based on ones already used in the file.
        if (std.string.find(lText, "\r\n") != -1 )
            lDelim = "\r\n";
        else
            lDelim = "\n";
        lLines = split( lText, lDelim);
        return lLines;
    }
}

--
Derek Parnell
Melbourne, Australia
20/03/2005 11:19:46 AM
Mar 19 2005
"Walter" <newshound digitalmars.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message
news:1p89hc3q3x38p$.18njvr6lvkb5x.dlg 40tude.net...
> Here is the code I used in the 'Build' utility...

The common line ending conventions are:

    \n    unix
    \r\n    windows
    \r    mac

so I suggest using std.string.splitLines() instead.
Mar 19 2005
Derek Parnell <derek psych.ward> writes:
On Sat, 19 Mar 2005 16:43:54 -0800, Walter wrote:

> so I suggest using std.string.splitLines() instead.

I normally would agree; however, the word "convention" was mentioned. ;-)

Initially, this is how I wrote the routines; that is, I assumed the conventions. However, I discovered that it is quite possible to have a Unix-EOL file in a Windows environment (e.g. one fetched via FTP from a Unix system, or one saved by your Windows editor as a 'Unix' file). So I instead opted to examine what the file itself was actually using before trying to split it into lines.

--
Derek Parnell
Melbourne, Australia
20/03/2005 1:10:52 PM
Mar 19 2005
Derek Parnell <derek psych.ward> writes:
(Oops. Left out the last line of my reply)

So I no longer assume that, because I'm running in a <whatever> environment, the file was *saved* using the <whatever> convention.

--
Derek Parnell
Melbourne, Australia
20/03/2005 1:18:16 PM
Mar 19 2005
J C Calvarese <jcc7 cox.net> writes:
Derek Parnell wrote:
> So I no longer assume that because I'm running in a <whatever> environment
> that the file was *saved* using the <whatever> convention.

Also, I think most users appreciate it when a program doesn't explode just because they're trying to input a Unix text file on their Windows PC. It's those little things that make the difference.

--
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
Mar 19 2005
AEon <AEon_member pathlink.com> writes:
Derek Parnell says...
> However, what I discovered was that it is quite possible to have a Unix-EOL
> file in a Windows environment (eg. One gotten via ftp from a Unix system, or
> one saved by your Windows editor as a 'Unix' file.)

Actually, the programming editor UltraEdit lets you do that as well, i.e. save files in Unix format. So in my really old code I checked for both \r and \n, to ensure the parser would not hiccup on log files.

AEon
Mar 19 2005
"Walter" <newshound digitalmars.com> writes:
"AEon" <AEon_member pathlink.com> wrote in message
news:d1id3r$273p$1 digitaldaemon.com...
> Does one need to parse the file as one char[] array block, and do all the
> line by line checking by hand?

You can use std.string.splitLines() to turn it into an array of lines.
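
A rough sketch of that suggestion (not code from the thread): read the whole file, then split it into lines. It assumes a current D2 Phobos, where `std.file.readText` exists and the function is spelled `splitLines`; in the D1-era Phobos discussed here the spelling was, according to later posts, `std.string.splitlines`. The helper name `readAllLines` is made up for illustration.

```d
import std.file;    // readText
import std.string;  // splitLines

// Hypothetical helper: read the whole file as text, then split it on
// whichever line endings it contains (\n, \r\n and \r are all handled).
string[] readAllLines(string fileName)
{
    return readText(fileName).splitLines();
}
```

Something like `readAllLines("app.cfg")` would yield one array element per line with the terminators stripped. The whole file is held in memory, so this suits small files such as configs rather than huge logs.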
Mar 19 2005
AEon <AEon_member pathlink.com> writes:
Walter says...

>> Does one need to parse the file as one char[] array block, and do all the
>> line by line checking by hand?
>
> You can use std.string.splitLines() to turn it into an array of lines.

Aha... I compiled a .doc file from the latest online info, but the std.string lib was not in it, so I missed that info. Is the std.string lib info new?

BTW: I started to port AEstats... the code is so clean, and neat, and simple to write, it makes me cry with happiness. Thanx for D... *snif*. I will never revisit ANSI C again!

> The common line ending conventions are:
>
>     \n    unix
>     \r\n    windows
>     \r    mac
>
> so I suggest using std.string.splitLines() instead.

Found that out the hard way when coding AEstats under Windows vs. Linux.

Does D have an EOL variable or the like that would point to the right escape chars, depending on the OS?

I noted that "import std.path;" has several vars like "sep" = "\" (Windows) and probably "/" under Linux.

If something like eof does not exist, I am sure it would be very helpful when trying to write portable code.

Martin and Derek, thanx for those real-world examples, will look into them.

AEon
Mar 19 2005
J C Calvarese <jcc7 cox.net> writes:
AEon wrote:
> Aha... I compiled a .doc file from the latest online info, but the std.string
> lib was not in it, so I missed that info. Is the std.string lib info new?

You mean the material in http://www.digitalmars.com/d/std_string.html? It's been there for at least a couple of years. I wouldn't call it new. ;)

(It did use to be part of http://www.digitalmars.com/d/phobos.html, though. I guess it got long enough to warrant a separate page.)

> BTW: I started to port AEstats... the code is so clean, and neat, and simple to
> write, it makes me cry with happiness. Thanx for D... *snif*. I will never
> revisit ANSI C again!

Another C user converted. Yay!

> Does D have an EOL variable or the like that would point to the right escape
> chars, depending on the OS?

Apparently, std.string doesn't have an equivalent function. I don't really see why you'd want it since we already have readLine and splitLines. I guess it wouldn't hurt to add such a function.

> I noted that "import std.path;" has several vars like "sep" = "\" (Windows) and
> probably "/" under Linux.
>
> If something like eof does not exist, I am sure it would be very helpful when
> trying to write portable code.

(I think you meant to write "eol" instead of "eof" here.)

I don't know the ins and outs of writing cross-platform code. How does it usually break down?

Windows: \r\n
Linux: \n
MacOS: \r

Does using \r\n for EOL typically fail to work properly on Linux and MacOS?

--
Justin (a/k/a jcc7)
http://jcc_7.tripod.com/d/
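
On the EOL-variable question: Derek's code earlier in the thread already appends `std.path.linesep`, which looks like exactly such a per-platform constant in the Phobos of the time. As a hedged sketch for a current D2 Phobos (an assumption, not something stated in the thread), the equivalents are `std.ascii.newline` and `std.path.dirSeparator`. The helper names `joinLinesNative` and `inDir` are made up for illustration.

```d
import std.ascii : newline;        // "\r\n" on Windows, "\n" elsewhere
import std.path : dirSeparator;    // "\\" on Windows, "/" on POSIX systems

// Hypothetical helper: join lines using the platform's own line terminator.
string joinLinesNative(string[] lines)
{
    string result;
    foreach (line; lines)
        result ~= line ~ newline;
    return result;
}

// Hypothetical helper: build "dir<sep>name" with the platform's separator.
string inDir(string dir, string name)
{
    return dir ~ dirSeparator ~ name;
}
```

Such constants are mainly useful when *writing* files. When *reading*, the point made elsewhere in the thread still applies: the file may not use the convention of the OS it happens to sit on.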
Mar 19 2005
AEon <AEon_member pathlink.com> writes:
J C Calvarese says...

> You mean the material in http://www.digitalmars.com/d/std_string.html? It's
> been there for at least a couple of years. I wouldn't call it new. ;)

I must have copy/pasted an old part. The links to that page (ironically) were in the doc :)... my jaw dropped when I saw all those functions... snif... this will make the rewrite so much easier. I noted regular expressions as well; possibly that would make all my tedious parsing almost trivial... :)

> Another C user converted. Yay!

I would have converted much earlier, had I known about D. But I seem to be privileged that most of D is "there" already.

> Apparently, std.string doesn't have an equivalent function. I don't really see
> why you'd want it since we already have readLine and splitLines. I guess it
> wouldn't hurt to add such a function.

That was just a thought. I usually think in patterns, and when I see something already implemented, and then something similar comes up, I take note :) Will look into those functions.

> (I think you meant to write "eol" instead of "eof" here.)

Oops, yep, EOL.

AEon
Mar 19 2005
Derek Parnell <derek psych.ward> writes:
On Sat, 19 Mar 2005 15:44:28 -0800, Walter wrote:

> You can use std.string.splitLines() to turn it into an array of lines.

I've just had a look at std.string.splitlines and it does do what I need, in that it doesn't assume any particular line ending convention. Thanks Walter, I'll use that from now on.

Sorry for ever doubting you ;-)

--
Derek
Melbourne, Australia
21/03/2005 11:34:38 AM
Mar 20 2005
"Walter" <newshound digitalmars.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message
news:i3yn7fqa6zex$.h2xheiurc8r5$.dlg 40tude.net...
> I've just had a look at std.string.splitlines and it does do what I need,
> in that it doesn't assume any particular line ending convention.

Line end parsing is one of those routine chores that everyone gets wrong; that's why it needs to be a standard library function. splitLines() is wrong, too: it needs to be fixed to recognize the Unicode LS and PS characters. But if everyone uses splitLines(), then their programs will automatically get fixed as well!
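
To make the point concrete, here is a minimal hand-rolled sketch (not the library's implementation) of a splitter that accepts \n, \r\n, \r and also the Unicode line separator (LS, U+2028) and paragraph separator (PS, U+2029). It assumes current D2 syntax and UTF-8 input, and the name `splitAnyEol` is made up for illustration.

```d
// Hypothetical helper: split text on \n, \r\n, \r, U+2028 (LS) or U+2029 (PS).
// Works on the raw UTF-8 bytes; LS and PS encode as E2 80 A8 and E2 80 A9.
string[] splitAnyEol(string text)
{
    string[] lines;
    size_t start = 0;   // index where the current line begins
    size_t i = 0;       // index of the code unit being examined

    while (i < text.length)
    {
        size_t eolLen = 0;  // byte length of the terminator found at i, 0 if none

        if (text[i] == '\r')
            eolLen = (i + 1 < text.length && text[i + 1] == '\n') ? 2 : 1;
        else if (text[i] == '\n')
            eolLen = 1;
        else if (i + 2 < text.length
                 && text[i] == 0xE2 && text[i + 1] == 0x80
                 && (text[i + 2] == 0xA8 || text[i + 2] == 0xA9))
            eolLen = 3;

        if (eolLen != 0)
        {
            lines ~= text[start .. i];  // the line, without its terminator
            i += eolLen;
            start = i;
        }
        else
        {
            ++i;
        }
    }

    if (start < text.length)    // last line had no terminator
        lines ~= text[start .. $];
    return lines;
}
```

Even this small version needs care with the \r\n pair and with the unterminated last line, which is a fair illustration of why this chore belongs in the standard library.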
Mar 20 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
Walter wrote:
<snip>
> You can use std.string.splitLines() to turn it into an array of
> lines.

Do you mean splitLines or splitlines?

Guess it's another reason to follow conventions - it helps you to remember what you called stuff. I bet Sun is having quite a bit of trouble....

http://www.mindprod.com/jgloss/gotchas.html#INCONSISTENCIES

Stewart.

--
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Mar 21 2005
David Medlock <noone nowhere.com> writes:
Is this what you need?

import std.stdio;
import std.stream;

auto x = new File( filename, FileMode.In );
scope(exit) { x.close(); }
foreach( char[] line; x ) writefln( "%s", line );

-DavidM
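
For comparison, a sketch of the same line-by-line idea against a current D2 Phobos (an assumption; the thread itself uses the older `std.stream` module). It relies on `std.stdio.File` and its `byLine` range; the helper name `linesOf` is made up for illustration.

```d
import std.stdio;

// Hypothetical helper: gather a file's lines by reading them one at a time.
string[] linesOf(string fileName)
{
    string[] result;
    auto f = File(fileName, "r");   // closed automatically when f leaves scope
    foreach (line; f.byLine())      // line is a char[] slice of a reused buffer
        result ~= line.idup;        // so copy it before keeping it
    return result;
}
```

By default `byLine` splits on \n only, so a Windows-style file read this way may leave a trailing \r on each line. Stripping it, or reading the whole file and using the library's line splitter, avoids that.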
Mar 03 2007
David Medlock <noone nowhere.com> writes:
Hehe.
Mar 03 2007