digitalmars.D.learn - Encoding of eol in multiline wysiwyg strings

KlausO <oberhofer users.sf.net> writes:
Hello,

does the D specification specify how the "end of line" is encoded when
you use wysiwyg strings? Currently it seems to be '\n' on Windows
(and I guess it will be '\n' on Linux, too).
Is this the intended behaviour?
It's not a big issue, but sometimes when you use wysiwyg strings, string
concatenation and import expressions to combine some text, the result is
a string with mixed EOL encodings.
Thanks for clarifying,

KlausO
Feb 17 2009
Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Feb 17, 2009 at 4:41 AM, KlausO <oberhofer users.sf.net> wrote:
> Hello,
>
> does the D specification specify how the "end of line" is encoded when you
> use wysiwyg strings? Currently it seems to be '\n' on Windows
> (and I guess it will be '\n' on Linux, too).
> Is this the intended behaviour?

http://www.digitalmars.com/d/1.0/lex.html

"Wysiwyg Strings
Wysiwyg quoted strings are enclosed by r" and ". All characters between the r" and " are part of the string except for EndOfLine which is regarded as a single \n character."

> It's not a big issue, but sometimes when you use wysiwyg strings, string
> concatenation and import expressions to combine some text, the result is a
> string with mixed EOL encodings.
> Thanks for clarifying,

It's the import() expression that's messing things up. It just loads the file verbatim and does no line-ending conversions.
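
To make the difference concrete, here is a minimal sketch of the behaviour described above. It is written in present-day D (D2) rather than the D 1.0 this thread discusses, and it does not call import() at all: the `imported` constant is only a stand-in for what import() would return for a file saved with CRLF line endings. All names are made up for illustration.

```d
// Sketch only; D2 syntax, names are hypothetical.
enum wysiwyg = r"first
second";

// Per the lexer rule quoted above, the EndOfLine inside the literal is a
// single '\n', regardless of how the source file itself is saved.
static assert(wysiwyg == "first\nsecond");

// Stand-in for the verbatim contents import() would give for a CRLF file.
enum imported = "third\r\nfourth\r\n";

// Concatenating the two yields one string with both EOL conventions.
enum combined = wysiwyg ~ "\n" ~ imported;
static assert(combined == "first\nsecond\nthird\r\nfourth\r\n");
```

The static asserts are checked at compile time, so the snippet should compile cleanly with `dmd -c` if the rule holds.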
Feb 17 2009
grauzone <none example.net> writes:
Jarrett Billingsley wrote:
> On Tue, Feb 17, 2009 at 4:41 AM, KlausO <oberhofer users.sf.net> wrote:
>> Hello,
>>
>> does the D specification specify how the "end of line" is encoded when you
>> use wysiwyg strings? Currently it seems to be '\n' on Windows
>> (and I guess it will be '\n' on Linux, too).
>> Is this the intended behaviour?
>
> http://www.digitalmars.com/d/1.0/lex.html
>
> "Wysiwyg Strings
> Wysiwyg quoted strings are enclosed by r" and ". All characters between the r" and " are part of the string except for EndOfLine which is regarded as a single \n character."
>
>> It's not a big issue, but sometimes when you use wysiwyg strings, string
>> concatenation and import expressions to combine some text, the result is a
>> string with mixed EOL encodings.
>> Thanks for clarifying,
>
> It's the import() expression that's messing things up. It just loads the file verbatim and does no line-ending conversions.

But many people would like to use import() to read binary data. I guess one could extend the language specification to solve this:

//load, convert line endings, check for valid UTF-8
char[] import_text(char[] filename);

//return unchanged file contents as byte array
ubyte[] import_binary(char[] filename);

On the other hand, both could be implemented as compile-time functions using the current import().
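
As a rough sketch of that last idea (not a proposal for the spec), the line-ending conversion could be an ordinary function that the compiler evaluates at compile time. This is written in present-day D (D2), `normalizeEol` is a hypothetical name, the UTF-8 validity check is left out, and the file names in the comments are placeholders.

```d
// Sketch only; normalizeEol is a hypothetical helper, not a library function.
// Converts "\r\n" and lone '\r' to '\n'. Plain loops keep it usable in CTFE.
string normalizeEol(string text)
{
    string result;
    for (size_t i = 0; i < text.length; ++i)
    {
        if (text[i] == '\r')
        {
            result ~= '\n';
            // Skip the '\n' of a "\r\n" pair so it is not emitted twice.
            if (i + 1 < text.length && text[i + 1] == '\n')
                ++i;
        }
        else
        {
            result ~= text[i];
        }
    }
    return result;
}

// Forces compile-time evaluation of the helper.
static assert(normalizeEol("a\r\nb\rc\n") == "a\nb\nc\n");

// Assumed usage with real files (needs the -J switch to locate them):
//   enum text = normalizeEol(import("notes.txt"));
//   auto raw  = cast(immutable(ubyte)[]) import("logo.bin");
```

The binary case needs no helper beyond a cast, since import() already returns the file contents unchanged.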
Feb 17 2009
Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Tue, Feb 17, 2009 at 10:02 AM, grauzone <none example.net> wrote:
> But many people would like to use import() to read binary data.

Oh, I'm not saying import() is in the wrong here :) just that that's where his mixed line endings are coming from.

> I guess one could extend the language specification to solve this:
>
> //load, convert line endings, check for valid UTF-8
> char[] import_text(char[] filename);
>
> //return unchanged file contents as byte array
> ubyte[] import_binary(char[] filename);
>
> On the other hand, both could be implemented as compile-time functions using
> the current import().

I suppose, as long as CTFE were made a bit more efficient. Can you imagine doing line-end conversions on a 20k line text file at compile time? The compiler would probably explode.
Feb 17 2009