digitalmars.D - Why does readln include the line terminator?

Georg Wrede (9/9) Apr 13 2009 Readln returns a string which contains the line terminator.

Daniel Keep (7/20) Apr 14 2009 Because if it stripped it, there's no way to know what it was. If you

Walter Bright (8/11) Apr 14 2009 That's right; there are currently at least 6 different line terminators:

Georg Wrede (20/32) Apr 14 2009 So the programmer who wants to write portable code, has to implement

BCS (4/9) Apr 14 2009 Only if you consider wanting to maintain merge-ability/diff-ability a...

Georg Wrede (6/17) Apr 14 2009 Doesn't this kind of prove my point? Changing a line ending /should not/...

BCS (2/16) Apr 14 2009 I make no assertions about what should be, only what is.

bearophile (8/11) Apr 14 2009 You use a string function or string method that removes the eventually p...

Georg Wrede (2/13) Apr 14 2009 Your code ends up printing the output on every other line.

Andrei Alexandrescu (3/5) Apr 14 2009 25 years and no networking code?

Steven Schveighoffer (5/9) Apr 14 2009 Been writing code for about 12 years, lots and lots of networking code. ...

Sean Kelly (6/17) Apr 14 2009 With HTTP, for example, lines are terminated with \r\n. The lines

Georg Wrede (4/9) Apr 14 2009 I can see having to use one or another line ending in the whole output

Nick Sabalausky (5/8) Apr 14 2009 Source code with unescaped nl's/cr's embedded in a string literal? Thoug...

Andrei Alexandrescu (31/41) Apr 14 2009 I think there are a few concerns when designing an API for reading

Daniel Keep (7/20) Apr 14 2009 Why not:

Andrei Alexandrescu (3/26) Apr 14 2009 And how about when sep is elaborate (e.g. regex)?

Daniel Keep (6/44) Apr 14 2009 Whatever was matched. If we have a file containing:

Andrei Alexandrescu (3/52) Apr 14 2009 Where did you specify the separator in the call to byLine?

Steven Schveighoffer (25/71) Apr 14 2009 I think he's not read the docs. Consider this usage instead:

Christopher Wright (4/5) Apr 15 2009 Why specify anything at compile time when a user could reasonably

Robert Fraser (3/10) Apr 15 2009 Yes, and for maximum abstraction, the config file should be stored as

Christopher Wright (5/16) Apr 15 2009 I just really hate to see templates when a regular function would

Steven Schveighoffer (9/24) Apr 15 2009 It's just a demonstration of what the OP was talking about but wasn't

Georg Wrede (2/16) Apr 15 2009

Stewart Gordon (21/26) Apr 14 2009 But readln only stops on '\n' (or whatever character you tell it to

Christopher Wright (5/18) Apr 14 2009 By default, tango does not exhibit this behavior. If you wish, you can

Georg Wrede (12/32) Apr 14 2009 Now this is more like it. The default should really be (in Phobos too)

Manfred Nowak (5/7) Apr 14 2009 This is false in case of simple copying. And I doubt, that for more

Denis Koroskin (2/9) Apr 14 2009 Tango does the best by having an optional parameter that denotes whether...

Georg Wrede (6/14) Apr 14 2009 For copying there is the operating system command, copy.

Manfred Nowak (3/4) Apr 14 2009 Agreed.

Kagamin (2/13) Apr 15 2009 I think, only (d) is important, all others are *strange* things. I usual...

Stewart Gordon (5/21) Apr 15 2009 So you expect text editors to discard both kinds of information?

Kagamin (2/3) Apr 16 2009 No. Text editor is a *specialized* text processing tool and it usually u...

Georg Wrede <georg.wrede iki.fi> writes:

Readln returns a string which contains the line terminator.

Is there a grand reason for this?


Currently there are a few drawbacks with this. The naive user doesn't 
expect it, and the seasoned user has to keep stripping it. And then he 
has to search the docs (or get hold of other OSs) to determine what 
terminator to expect on other systems.

And it can't really be a speed optimization either, because to do 
anything useful with a string, you have to strip the terminator anyway 
at some point.

Apr 13 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Georg Wrede wrote:
 Readln returns a string which contains the line terminator.
 
 Is there a grand reason for this?
 
 
 Currently there are a few drawbacks with this. The naive user doesn't
 expect it, and the seasoned user has to keep stripping it. And then he
 has to search the docs (or get hold of other OSs) to determine what
 terminator to expect on other systems.
 
 And it can't really be a speed optimization either, because to do
 anything useful with a string, you have to strip the terminator anyway
 at some point.

Because if it stripped it, there's no way to know what it was.  If you
want to do per-line processing but don't want to clobber the line
endings, readln has to return the line terminator.

Besides which, it's a single function call to strip it off irrespective
of OS.

  -- Daniel

Apr 14 2009
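A minimal sketch of the "single function call" mentioned above, assuming present-day Phobos (D2), where std.string.chomp removes at most one trailing terminator regardless of which kind it is. The helper names stripTerminator and echoStripped are hypothetical and exist only for this illustration.

```d
import std.stdio : File, writeln;
import std.string : chomp;

/// Hypothetical helper: returns the line without its trailing terminator.
string stripTerminator(string line)
{
    // chomp removes at most one trailing terminator ("\r\n", "\n", "\r", ...)
    // and leaves the string unchanged if there is none.
    return chomp(line);
}

/// Hypothetical helper: echoes every line of `input` without its terminator.
void echoStripped(File input)
{
    string line;
    // readln keeps the terminator, so a successfully read line is never empty;
    // an empty result therefore signals end of input.
    while ((line = input.readln()).length > 0)
        writeln(stripTerminator(line));
}

void main()
{
    // The same call handles every common terminator.
    assert(stripTerminator("hello\n") == "hello");
    assert(stripTerminator("hello\r\n") == "hello");
    assert(stripTerminator("hello\r") == "hello");
    assert(stripTerminator("hello") == "hello");

    // To filter standard input, import std.stdio.stdin and call:
    // echoStripped(stdin);
}
```

The caller never needs to know whether the terminator was one or two characters long, which avoids the manual $-1 versus $-2 slicing.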

Walter Bright <newshound1 digitalmars.com> writes:

Daniel Keep wrote:
 Because if it stripped it, there's no way to know what it was.  If you
 want to do per-line processing but don't want to clobber the line
 endings, readln has to return the line terminator.

That's right; there are currently at least 6 different line terminators:

CR
LF
CRLF
FF
PS
LS

Apr 14 2009
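As a rough illustration of the six terminators listed above, the sketch below feeds one string containing each of them to std.string.splitLines. It assumes a current Phobos in which splitLines treats CR, LF, CRLF, FF, PS (U+2029) and LS (U+2028) as line breaks (it is understood to accept NEL and VT as well); the sample text is made up.

```d
import std.string : splitLines;
import std.typecons : Yes;

void main()
{
    // One sample line per terminator: CR, LF, CRLF, FF, PS, LS.
    string text = "one\rtwo\nthree\r\nfour\ffive\u2029six\u2028seven";

    // Default: terminators are stripped.
    auto lines = text.splitLines();
    assert(lines == ["one", "two", "three", "four", "five", "six", "seven"]);

    // On request, each line keeps whatever terminator ended it.
    auto kept = text.splitLines(Yes.keepTerminator);
    assert(kept[0] == "one\r");
    assert(kept[2] == "three\r\n");
    assert(kept[$ - 1] == "seven"); // last line had no terminator
}
```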

Georg Wrede <georg.wrede iki.fi> writes:

Walter Bright wrote:
 Daniel Keep wrote:
 Because if it stripped it, there's no way to know what it was.  If you
 want to do per-line processing but don't want to clobber the line
 endings, readln has to return the line terminator.


Who wants to receive a line with varying line endings anyway???

 That's right; there are currently at least 6 different line terminators:
 
 CR
 LF
 CRLF
 FF
 PS
 LS

So the programmer who wants to write portable code, has to implement 
awareness for all of these cases, in each of his programs?

This seems a bit laborious. Replacing stuff at the end of the string 
forces him to check, for *each* line, the length of the terminator, and 
then use ...$-1 and at other times ...$-2, etc. in his code.

In 25 years of computing, I have yet to see a file where variation of 
line terminators in the file contained some /deliberate/ information. 
And the only purpose for keeping the line endings would be to edit files 
while preserving the particular line terminator for each line. Which 
raises the question, how do you decide which terminator to use if you've 
inserted a line?

So the whole point is absurd. A reasonable default behavior for a file 
mongering program would be to output line terminators according to the 
operating system default. The case where one *wants* to preserve them, 
should be considered the exception.


I'm simply asking for the default to be to strip the terminator, thus 
relieving the programmer from, imho, gratuitous labor. You can still 
preserve the current functionality as an option.

Apr 14 2009

BCS <ao pathlink.com> writes:

Reply to Georg,

 So the whole point is absurd. A reasonable default behavior for a file
 mongering program would be to output line terminators according to the
 operating system default. The case where one *wants* to preserve them,
 should be considered the exception.
 

Only if you consider wanting to maintain merge-ability/diff-ability as 
the exception. Some, if not most, source control/diff/merge tools consider 
changes in line endings as changes.

Apr 14 2009

Georg Wrede <georg.wrede iki.fi> writes:

BCS wrote:
 Reply to Georg,
 
 So the whole point is absurd. A reasonable default behavior for a file
 mongering program would be to output line terminators according to the
 operating system default. The case where one *wants* to preserve them,
 should be considered the exception.

 
 Only if you consider wanting to maintain merge-ability/diff-ability 
 as the exception. Some, if not most, source control/diff/merge tools 
 consider changes in line endings as changes.

Doesn't this kind of prove my point? Changing a line ending /should not/ 
be a "difference". Not by default. They should have a switch to 
explicitly turn it on.

A good diff is complex enough that it should not stumble on form when it 
is supposed to examine content.

Apr 14 2009

BCS <ao pathlink.com> writes:

Reply to Georg,

 BCS wrote:
 
 Only if you consider wanting to maintain
 merge-ability/diff-ability as the exception. Some, if not most,
 source control/diff/merge tools consider changes in line endings as
 changes.
 

 Doesn't this kind of prove my point? Changing a line ending /should
 not/ be a "difference". Not by default. They should have a switch to
 explicitly turn it on.
 
 A good diff is complex enough that it should not stumble on form when
 it is supposed to examine content.
 

I make no assertions about what should be, only what is.

Apr 14 2009

bearophile <bearophileHUGS lycos.com> writes:

Georg Wrede:
 This seems a bit laborious. Replacing stuff at the end of the string 
 forces him to check, for *each* line, the length of the terminator, and 
 then use ...$-1 and at other times ...$-2, etc. in his code.

You use a string function or method that removes the trailing newline, if
present, whatever kind it is. There is one in std.string too. Its main problem
(besides working only with char[] in D1) is that its name is too similar to
another string function. I complained about this some time ago.

Regarding the newline at the end of lines, in Python:
for line in file("somefilename.txt"):
    print line
line contains the ending new line too.

Bye,
bearophile

Apr 14 2009

Georg Wrede <georg.wrede iki.fi> writes:

bearophile wrote:
 Georg Wrede:
 This seems a bit laborious. Replacing stuff at the end of the string 
 forces him to check, for *each* line, the length of the terminator, and 
 then use ...$-1 and at other times ...$-2, etc. in his code.

 
 You use a string function or method that removes the trailing newline, if
 present, whatever kind it is. There is one in std.string too. Its main problem
 (besides working only with char[] in D1) is that its name is too similar to
 another string function. I complained about this some time ago.
 
 Regarding the newline at the end of lines, in Python:
 for line in file("somefilename.txt"):
     print line
 line contains the ending new line too.

Your code ends up printing the output on every other line.

Apr 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Georg Wrede wrote:
 In 25 years of computing, I have yet to see a file where variation of 
 line terminators in the file contained some /deliberate/ information.

25 years and no networking code?

Andrei

Apr 14 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 14 Apr 2009 13:19:49 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Georg Wrede wrote:
 In 25 years of computing, I have yet to see a file where variation of  
 line terminators in the file contained some /deliberate/ information.

 25 years and no networking code?

Been writing code for about 12 years, lots and lots of networking code.   
Still have never seen this.  Don't see your point either.

-Steve

Apr 14 2009

Sean Kelly <sean invisibleduck.org> writes:

Steven Schveighoffer wrote:
 On Tue, 14 Apr 2009 13:19:49 -0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 Georg Wrede wrote:
 In 25 years of computing, I have yet to see a file where variation of 
 line terminators in the file contained some /deliberate/ information.

 25 years and no networking code?

 
 Been writing code for about 12 years, lots and lots of networking code.  
 Still have never seen this.  Don't see your point either.

With HTTP, for example, lines are terminated with \r\n.  The lines 
themselves (in the header, at least) have constraints on the character 
range they allow, so one might want to error on solo \n but break on a 
\r\n, etc.  Still, I don't know why anyone would use readln() for 
processing a network protocol, so perhaps the issue is moot.

Apr 14 2009
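The HTTP point above can be sketched as follows: a strict parser can only tell CRLF from a bare LF if the reader hands the terminator back. This is a minimal illustration, not a real HTTP parser; the Terminator enum and the classifyTerminator function are hypothetical names.

```d
import std.algorithm.searching : endsWith;

/// Hypothetical classification of how a raw line was terminated.
enum Terminator { crlf, bareLF, none }

/// Inspects a raw line (terminator still attached) and reports its ending.
Terminator classifyTerminator(const(char)[] raw)
{
    // Check the two-character ending first, since "\r\n" also ends with "\n".
    if (raw.endsWith("\r\n"))
        return Terminator.crlf;
    if (raw.endsWith("\n"))
        return Terminator.bareLF;
    return Terminator.none;
}

void main()
{
    // A well-formed header line ends with CRLF.
    assert(classifyTerminator("Host: example.org\r\n") == Terminator.crlf);
    // A strict parser might reject this one.
    assert(classifyTerminator("Host: example.org\n") == Terminator.bareLF);
    // The stream ended before any terminator was seen.
    assert(classifyTerminator("Host: example.org") == Terminator.none);
}
```

Had the reader stripped the terminator, all three inputs would arrive as the same string and the distinction would be lost.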

Georg Wrede <georg.wrede iki.fi> writes:

Andrei Alexandrescu wrote:
 Georg Wrede wrote:
 In 25 years of computing, I have yet to see a file where variation of 
 line terminators in the file contained some /deliberate/ information.

 
 25 years and no networking code?

I can see having to use one or another line ending in the whole output 
file, but not a situation where some lines and not some other need this 
or that kind of line ending.

Apr 14 2009

"Nick Sabalausky" <a a.a> writes:

"Georg Wrede" <georg.wrede iki.fi> wrote in message 
news:gs2o15$233h$2 digitalmars.com...
 I can see having to use one or another line ending in the whole output 
 file, but not a situation where some lines and not some other need this or 
 that kind of line ending.

Source code with unescaped nl's/cr's embedded in a string literal? Though I 
admit that may not be a particularly compelling case for at least a couple 
of different reasons. (I do agree with your original point though.)

Apr 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Nick Sabalausky wrote:
 "Georg Wrede" <georg.wrede iki.fi> wrote in message 
 news:gs2o15$233h$2 digitalmars.com...
 I can see having to use one or another line ending in the whole output 
 file, but not a situation where some lines and not some other need this or 
 that kind of line ending.

 
 Source code with unescaped nl's/cr's embedded in a string literal? Though I 
 admit that may not be a particularly compelling case for at least a couple 
 of different reasons. (I do agree with your original point though.) 

I think there are a few concerns when designing an API for reading 
separated lines.

1. Reasonably complex separators should be allowed, e.g. regexes. For 
streams that have lookahead = 1, only regexes without backtracking 
(i.e., classic regular expressions) can be allowed.

2. Alternate separators should be allowed, and information should be 
passed as to which one, if any, matched:

readln(stream, '\n', '\r', "Brought to you by Carl's Jr.\n");

You should be able to somehow extract which one of these matched, or 
whether the stream ended without having seen any. The match process is 
similar to regexes, but the information returned would be difficult to 
extract from a regex match.

3. Given (1) and (2), the process of eliminating the matched separator 
can become rather involved. So there should be an option to just 
eliminate the separator.

4. However, the separator should be made available to the caller. That 
makes for programs that preserve the separator, whatever it was.

I plan to implement a little API around these considerations, but 
haven't gotten around to it. Particularly the regex thing is rather 
thorny because std.regex does not distinguish classic regular 
expressions from those needing backtracking, and does not have an 
implementation that works with limited-lookahead streams. I suspect that 
that would be a major effort.

Right now readln preserves the separator. The newer File.byLine 
eliminates it by default and offers to keep it by calling 
File.byLine(KeepTerminator.yes). The allowed terminators are one 
character or a string. See

http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

I consider such an API adequate but insufficient; we need to add to it.


Andrei

Apr 14 2009
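The byLine behaviour described above can be sketched like this, assuming the std.stdio API as it exists in present-day Phobos (File.byLine taking a KeepTerminator flag). The helper readLines and the scratch file name are hypothetical.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, KeepTerminator;

/// Hypothetical helper: reads all lines of a file, with or without terminators.
string[] readLines(string path, KeepTerminator keep)
{
    string[] result;
    // byLine reuses its internal buffer, so each line is copied with idup.
    foreach (line; File(path).byLine(keep))
        result ~= line.idup;
    return result;
}

void main()
{
    // Scratch file used only for this demonstration.
    auto path = buildPath(tempDir(), "byline_demo.txt");
    write(path, "first\nsecond\n");
    scope (exit) remove(path);

    // Default-style behaviour: the terminator is dropped.
    assert(readLines(path, KeepTerminator.no) == ["first", "second"]);
    // Opt-in behaviour: the terminator is preserved.
    assert(readLines(path, KeepTerminator.yes) == ["first\n", "second\n"]);
}
```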

Daniel Keep <daniel.keep.lists gmail.com> writes:

Andrei Alexandrescu wrote:
 ...
 
 Right now readln preserves the separator. The newer File.byLine
 eliminates it by default and offers to keep it by calling
 File.byLine(KeepTerminator.yes). The allowed terminators are one
 character or a string. See
 
 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine
 
 I consider such an API adequate but insufficient; we need to add to it.
 
 
 Andrei

Why not:

char[] line, sep;
line = File.byLine();    // discard sep
line = File.byLine(sep); // pass sep out

The separator is likely to be more useful once extracted.

  -- Daniel

Apr 14 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Daniel Keep wrote:
 
 Andrei Alexandrescu wrote:
 ...

 Right now readln preserves the separator. The newer File.byLine
 eliminates it by default and offers to keep it by calling
 File.byLine(KeepTerminator.yes). The allowed terminators are one
 character or a string. See

 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

 I consider such an API adequate but insufficient; we need to add to it.


 Andrei

 
 Why not:
 
 char[] line, sep;
 line = File.byLine();    // discard sep
 line = File.byLine(sep); // pass sep out
 
 The separator is likely to be more useful once extracted.

And how about when sep is elaborate (e.g. regex)?

Andrei

Apr 14 2009

Daniel Keep <daniel.keep.lists gmail.com> writes:

Andrei Alexandrescu wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 ...

 Right now readln preserves the separator. The newer File.byLine
 eliminates it by default and offers to keep it by calling
 File.byLine(KeepTerminator.yes). The allowed terminators are one
 character or a string. See

 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

 I consider such an API adequate but insufficient; we need to add to it.


 Andrei

 Why not:

 char[] line, sep;
 line = File.byLine();    // discard sep
 line = File.byLine(sep); // pass sep out

 The separator is likely to be more useful once extracted.

 
 And how about when sep is elaborate (e.g. regex)?
 
 Andrei

Whatever was matched.  If we have a file containing:

"A.B,C"

And we split lines using /[.,]/, then this:

 char[] line, sep;
 line = File.byLine(sep);
 while( line != "" )
 {
     writefln(`line = "%s", sep = "%s"`, line, sep);
     line = File.byLine(sep);
 }

Would output this:

 line = "A", sep = "."
 line = "B", sep = ","
 line = "C", sep = ""

  -- Daniel

Apr 14 2009
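The line/separator pairing proposed above can be approximated on an in-memory string with std.regex, without any change to byLine. This is only a sketch: the Piece type and splitKeepingSeparators are hypothetical names, the pattern is assumed never to match the empty string, and it works on a whole string rather than on a limited-lookahead stream.

```d
import std.regex : matchFirst, regex;
import std.stdio : writefln;
import std.typecons : Tuple;

/// Hypothetical pair of a line and the separator that ended it ("" if none).
alias Piece = Tuple!(string, "line", string, "sep");

/// Splits `input` on `pattern`, recording which separator matched each time.
/// Assumes `pattern` never matches the empty string.
Piece[] splitKeepingSeparators(string input, string pattern)
{
    auto re = regex(pattern);
    Piece[] pieces;
    string rest = input;
    while (rest.length > 0)
    {
        auto m = matchFirst(rest, re);
        if (m.empty)
        {
            // No further separator: the remainder is the last line.
            pieces ~= Piece(rest, "");
            break;
        }
        pieces ~= Piece(m.pre, m.hit); // text before the match, and the match
        rest = m.post;                 // continue after the separator
    }
    return pieces;
}

void main()
{
    foreach (p; splitKeepingSeparators("A.B,C", "[.,]"))
        writefln(`line = "%s", sep = "%s"`, p.line, p.sep);
}
```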

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Daniel Keep wrote:
 
 Andrei Alexandrescu wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 ...

 Right now readln preserves the separator. The newer File.byLine
 eliminates it by default and offers to keep it by calling
 File.byLine(KeepTerminator.yes). The allowed terminators are one
 character or a string. See

 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

 I consider such an API adequate but insufficient; we need to add to it.


 Andrei

 Why not:

 char[] line, sep;
 line = File.byLine();    // discard sep
 line = File.byLine(sep); // pass sep out

 The separator is likely to be more useful once extracted.

 And how about when sep is elaborate (e.g. regex)?

 Andrei

 
 Whatever was matched.  If we have a file containing:
 
 "A.B,C"
 
 And we split lines using /[.,]/, then this:
 
 char[] line, sep;
 line = File.byLine(sep);
 while( line != "" )
 {
     writefln(`line = "%s", sep = "%s"`, line, sep);
     line = File.byLine(sep);
 }

 
 Would output this:
 
 line = "A", sep = "."
 line = "B", sep = ","
 line = "C", sep = ""

 
   -- Daniel

Where did you specify the separator in the call to byLine?

Andrei

Apr 14 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 15 Apr 2009 00:21:48 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Daniel Keep wrote:
  Andrei Alexandrescu wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 ...

 Right now readln preserves the separator. The newer File.byLine
 eliminates it by default and offers to keep it by calling
 File.byLine(KeepTerminator.yes). The allowed terminators are one
 character or a string. See

 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

 I consider such an API adequate but insufficient; we need to add to  
 it.


 Andrei

 Why not:

 char[] line, sep;
 line = File.byLine();    // discard sep
 line = File.byLine(sep); // pass sep out

 The separator is likely to be more useful once extracted.

 And how about when sep is elaborate (e.g. regex)?

 Andrei

  Whatever was matched.  If we have a file containing:
  "A.B,C"
  And we split lines using /[.,]/, then this:

 char[] line, sep;
 line = File.byLine(sep);
 while( line != "" )
 {
     writefln(`line = "%s", sep = "%s"`, line, sep);
     line = File.byLine(sep);
 }

  Would output this:

 line = "A", sep = "."
 line = "B", sep = ","
 line = "C", sep = ""

    -- Daniel

 Where did you specify the separator in the call to byLine?

I think he's not read the docs.  Consider this usage instead:

auto reader = file.byLine!("/[.,]/")();
// normal usage, doesn't return separators
foreach(line; reader)
{
...
}

// alternate usage, returns separators as well
while(!reader.empty)
{
   char[] sep;
   char[] line = reader.front(sep); // can't remember if this is what you decided on.
   ...
   reader.popFront(); // ditto
}

// Note that if foreach on ranges was extended to allow multiple parameters per pass, you could do:

foreach(sep, line; reader)
{
...
}

-Steve

Apr 14 2009

Christopher Wright <dhasenan gmail.com> writes:

Steven Schveighoffer wrote:
 auto reader = file.byLine!("/[.,]/")();

Why specify anything at compile time when a user could reasonably 
generate the value at runtime?

auto reader = file.byLine(readConfig().separator);

Apr 15 2009

Robert Fraser <fraserofthenight gmail.com> writes:

Christopher Wright wrote:
 Steven Schveighoffer wrote:
 auto reader = file.byLine!("/[.,]/")();

 
 Why specify anything at compile time when a user could reasonably 
 generate the value at runtime?
 
 auto reader = file.byLine(readConfig().separator);

Yes, and for maximum abstraction, the config file should be stored as 
XML in a TEXT field of a database on another server.

Apr 15 2009

Christopher Wright <dhasenan gmail.com> writes:

Robert Fraser wrote:
 Christopher Wright wrote:
 Steven Schveighoffer wrote:
 auto reader = file.byLine!("/[.,]/")();

 Why specify anything at compile time when a user could reasonably 
 generate the value at runtime?

 auto reader = file.byLine(readConfig().separator);

 
 Yes, and for maximum abstraction, the config file should be stored as 
 XML in a TEXT field of a database on another server.

I just really hate to see templates when a regular function would 
suffice and be so close to the same efficiency as makes no difference 
for most reasonable situations. If there's a significant performance 
increase, I want to see both options.

Apr 15 2009

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 15 Apr 2009 22:54:50 -0400, Christopher Wright  
<dhasenan gmail.com> wrote:

 Robert Fraser wrote:
 Christopher Wright wrote:
 Steven Schveighoffer wrote:
 auto reader = file.byLine!("/[.,]/")();

 Why specify anything at compile time when a user could reasonably  
 generate the value at runtime?

 auto reader = file.byLine(readConfig().separator);

  Yes, and for maximum abstraction, the config file should be stored as  
 XML in a TEXT field of a database on another server.

 I just really hate to see templates when a regular function would  
 suffice and be so close to the same efficiency as makes no difference  
 for most reasonable situations. If there's a significant performance  
 increase, I want to see both options.

It's just a demonstration of what the OP was talking about but wasn't  
explaining properly.  I have no intention of writing or supporting this  
code.

I think it's fine if Andrei decides to write this code and uses a function  
parameter instead of a template parameter; that I used a template  
parameter instead of a function parameter is not a hidden suggestion.

-Steve

Apr 15 2009

Georg Wrede <georg.wrede iki.fi> writes:

Andrei Alexandrescu wrote:
 I plan to implement a little API around these considerations, but 
 haven't gotten around to it. Particularly the regex thing is rather 
 thorny because std.regex does not distinguish classic regular 
 expressions from those needing backtracking, and does not have an 
 implementation that works with limited-lookahead streams. I suspect that 
 that would be a major effort.
 
 Right now readln preserves the separator. The newer File.byLine 
 eliminates it by default and offers to keep it by calling 

Excellent!!

 File.byLine(KeepTerminator.yes). The allowed terminators are one 
 character or a string. See
 
 http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine

 I consider such an API adequate but insufficient; we need to add to it.

Apr 15 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Daniel Keep wrote:
 Georg Wrede wrote:
 Readln returns a string which contains the line terminator.


<snip>
 Because if it stripped it, there's no way to know what it was.  If you
 want to do per-line processing but don't want to clobber the line
 endings, readln has to return the line terminator.

But readln only stops on '\n' (or whatever character you tell it to 
otherwise), so will miss Mac "\r" endings altogether.  As such, it's 
useless for this purpose.

The big question, however, is why std.stream.InputStream doesn't have 
readln.  It has readLine, which has different semantics - it understands 
all three line break styles and strips them.  This is absurd since 
you're more likely to care about what line ending is used when reading 
in a text file than when reading from stdin.

Take these four cases:
(a) you want to process only files with a specific line ending style
(b) you want to know what line endings are used
(c) you don't care about what line endings are used, but still want to 
know whether or not the file ends with one
(d) you just want to read the file line by line, without caring about 
the line endings or the presence or absence of one at the end

At the moment, readln is good only for (a).  readLine is good only for 
(d).  If you want (b) or (c), you'll have to come up with an alternative 
means.

Stewart.

Apr 14 2009
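Cases (b) and (c) above can be handled with a small scan over the raw text, sketched below. It covers only the three classic styles (LF, CRLF, lone CR); the LineEndingReport struct and the inspectLineEndings function are hypothetical names for this illustration.

```d
/// Hypothetical summary of the line endings found in a text.
struct LineEndingReport
{
    size_t lf;               // count of bare "\n"
    size_t crlf;             // count of "\r\n"
    size_t cr;               // count of lone "\r"
    bool endsWithTerminator; // case (c): does the text end with a line break?
}

/// Counts each line-ending style and notes whether the text ends with one.
LineEndingReport inspectLineEndings(const(char)[] text)
{
    LineEndingReport report;
    for (size_t i = 0; i < text.length; ++i)
    {
        if (text[i] == '\r')
        {
            if (i + 1 < text.length && text[i + 1] == '\n')
            {
                ++report.crlf;
                ++i; // skip the '\n' that belongs to this CRLF
            }
            else
                ++report.cr;
        }
        else if (text[i] == '\n')
            ++report.lf;
    }
    report.endsWithTerminator =
        text.length > 0 && (text[$ - 1] == '\n' || text[$ - 1] == '\r');
    return report;
}

void main()
{
    auto mixed = inspectLineEndings("a\r\nb\nc\rd");
    assert(mixed.crlf == 1 && mixed.lf == 1 && mixed.cr == 1);
    assert(!mixed.endsWithTerminator);

    assert(inspectLineEndings("x\n").endsWithTerminator);
}
```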

Christopher Wright <dhasenan gmail.com> writes:

Georg Wrede wrote:
 Readln returns a string which contains the line terminator.
 
 Is there a grand reason for this?
 
 
 Currently there are a few drawbacks with this. The naive user doesn't 
 expect it, and the seasoned user has to keep stripping it. And then he 
 has to search the docs (or get hold of other OSs) to determine what 
 terminator to expect on other systems.
 
 And it can't really be a speed optimization either, because to do 
 anything useful with a string, you have to strip the terminator anyway 
 at some point.

By default, tango does not exhibit this behavior. If you wish, you can 
include newlines:

auto str = Cin.copyln; // no newline in str
auto str2 = Cin.copyln(true); // has system-dependent newline

Apr 14 2009

Georg Wrede <georg.wrede iki.fi> writes:

Christopher Wright wrote:
 Georg Wrede wrote:
 Readln returns a string which contains the line terminator.

 Is there a grand reason for this?


 Currently there are a few drawbacks with this. The naive user doesn't 
 expect it, and the seasoned user has to keep stripping it. And then he 
 has to search the docs (or get hold of other OSs) to determine what 
 terminator to expect on other systems.

 And it can't really be a speed optimization either, because to do 
 anything useful with a string, you have to strip the terminator anyway 
 at some point.

 
 By default, tango does not exhibit this behavior. If you wish, you can 
 include newlines:
 
 auto str = Cin.copyln; // no newline in str
 auto str2 = Cin.copyln(true); // has system-dependent newline

Now this is more like it. The default should really be (in Phobos too) 
to not return the newline. (Hint to Walter: Tango is for users, by 
users, and if they have no newline as the default, it should be 
considered a serious hint as to what the programmer prefers.)

If one is really interested in doing some file manipulation which might 
*preserve* varying line terminators in files that might have been edited 
in both linux and dos, then he should use "the non-default" line 
reading, like the Cin.copyln(true) above. Not that I'd see the point.

I'm certain that the overwhelming majority of cases where one reads 
lines (_especially_ from the console, but from text files, too), one 
just wants the contents of the string.

Apr 14 2009

Manfred Nowak <svv1999 hotmail.com> writes:

Georg Wrede wrote:

 because to do anything useful with a string, you have to strip the
 terminator

This is false in case of simple copying. And I doubt, that for more 
complex operations splitting `readln' into `readlnBody' and 
`readlnEOL' and calling them intermittent would be of any benefit.

-manfred

Apr 14 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Tue, 14 Apr 2009 18:01:52 +0400, Manfred Nowak <svv1999 hotmail.com> wrote:

 Georg Wrede wrote:

 because to do anything useful with a string, you have to strip the
 terminator

 This is false in case of simple copying. And I doubt, that for more
 complex operations splitting `readln' into `readlnBody' and
 `readlnEOL' and calling them intermittent would be of any benefit.

 -manfred

Tango does the best by having an optional parameter that denotes whether a line
ending needs to be retained.

Apr 14 2009

Georg Wrede <georg.wrede iki.fi> writes:

Manfred Nowak wrote:
 Georg Wrede wrote:
 
 because to do anything useful with a string, you have to strip the
 terminator

 
 This is false in case of simple copying. And I doubt, that for more 
 complex operations splitting `readln' into `readlnBody' and 
 `readlnEOL' and calling them intermittent would be of any benefit.

For copying there is the operating system command, copy.

Additionally, simple copy is hardly the most used thing when readln is 
invoked.

So, either, there should be two functions, one of which preserves the 
terminator, or (like in Tango) there should be a parameter to turn them on.

Apr 14 2009

Manfred Nowak <svv1999 hotmail.com> writes:

Georg Wrede wrote:

 So, either, there should be [...]

Agreed.

-manfred

Apr 14 2009

Kagamin <spam here.lot> writes:

Stewart Gordon Wrote:

 Take these four cases:
 (a) you want to process only files with a specific line ending style
 (b) you want to know what line endings are used
 (c) you don't care about what line endings are used, but still want to 
 know whether or not the file ends with one
 (d) you just want to read the file line by line, without caring about 
 the line endings or the presence or absence of one at the end
 
 At the moment, readln is good only for (a).  readLine is good only for 
 (d).  If you want (b) or (c), you'll have to come up with an alternative 
 means.

I think, only (d) is important, all others are *strange* things. I usually use
ReadLine in conjunction with WriteLine.

Apr 15 2009

Stewart Gordon <smjg_1998 yahoo.com> writes:

Kagamin wrote:
 Stewart Gordon Wrote:
 
 Take these four cases:
 (a) you want to process only files with a specific line ending style
 (b) you want to know what line endings are used
 (c) you don't care about what line endings are used, but still want to 
 know whether or not the file ends with one
 (d) you just want to read the file line by line, without caring about 
 the line endings or the presence or absence of one at the end

 At the moment, readln is good only for (a).  readLine is good only for 
 (d).  If you want (b) or (c), you'll have to come up with an alternative 
 means.

 
 I think, only (d) is important, all others are *strange* things. I 
 usually use ReadLine in conjunction with WriteLine.

So you expect text editors to discard both kinds of information?

I expect any text editor (don't get me started on Notepad) to do (c), 
and any decent text editor to be capable of (b).

Stewart.

Apr 15 2009

Kagamin <spam here.lot> writes:

Stewart Gordon Wrote:

 So you expect text editors to discard both kinds of information?

No. A text editor is a *specialized* text processing tool, and it usually uses
specialized text processing algorithms. Otherwise it is *quite* decent.

Apr 16 2009
