
digitalmars.D.learn - Difference between chunks(stdin, 1) and stdin.rawRead?

jms <jersni gmail.com> writes:
Why in the below silly program am I reading both the \r and \n 
characters when using rawRead in block a, but when looping by 1 
byte chunks in block b only appear to be reading the \n 
characters?

I'm on Windows 11 using DMD64 D Compiler v2.107.1 if that 
matters, but I'm thinking this maybe has something to do with 
stdin in general that I'm not aware of. Any pointers to 
understanding what's going on would be appreciated.

import std.stdio;

void main() {
     int i;
a: {
         i = 0;
         writeln("\nin a");
         ubyte[1] buffer;
         while (true) {
             i++;
             stdin.rawRead(buffer);
             if (buffer[0] == 13) {
                 write("CR");
             } else if (buffer[0] == 10) {
                 write("LF");
             }
             if (i > 5) {
                 goto b;
             }

         }
     }
b: {

         writeln("\n\nin b");
         i = 0;
         foreach (ubyte[] buffer; chunks(stdin, 1)) {
             i++;
             if (buffer[0] == 13) {
                 write("cr");
             } else if (buffer[0] == 10) {
                 write("lf");
             }
             if (i > 5) {
                 goto a;
             }
         }
     }

}



Output:
in a

CRLF
CRLF
CRLF

in b

lf
lf
lf
lf
lf
lf
in a
Mar 27
jms <jersni gmail.com> writes:
On Thursday, 28 March 2024 at 02:30:11 UTC, jms wrote:
[...]
I think I figured it out: the difference is probably in the mode.

This documentation
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fread?view=msvc-170
mentions that "If the given stream is opened in text mode, Windows-style newlines are converted into Unix-style newlines. That is, carriage return-line feed (CRLF) pairs are replaced by single line feed (LF) characters."

And rawRead's documentation mentions that "rawRead always reads in binary mode on Windows.", which I guess should have given me a clue. chunks must be using text-mode.
Mar 28
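
A minimal sketch of the comparison described above, using a scratch file instead of stdin so it can be run without typing input. The file name and the two helper functions (countViaRawRead, countViaChunks) are made up for illustration and are not part of the original program. The file is opened in the default "r" mode both times; only the reading call differs. If the explanation above is right, on Windows rawRead should see the CR bytes and chunks should not, while on other platforms both counts should match.

```d
import std.stdio;
static import std.file;
import std.path : buildPath;

// Count CR (13) bytes using rawRead on a file opened in the default "r" mode.
size_t countViaRawRead(string path)
{
    size_t n;
    auto f = File(path, "r");
    ubyte[1] buffer;
    // rawRead returns the slice it actually filled; it is empty at end of file.
    while (f.rawRead(buffer[]).length == 1)
    {
        if (buffer[0] == 13)
            n++;
    }
    return n;
}

// Count CR (13) bytes using chunks() on a file opened in the default "r" mode.
size_t countViaChunks(string path)
{
    size_t n;
    auto f = File(path, "r");
    foreach (ubyte[] buffer; chunks(f, 1))
    {
        if (buffer[0] == 13)
            n++;
    }
    return n;
}

void main()
{
    // Hypothetical scratch file; std.file.write stores the bytes unchanged.
    auto path = buildPath(std.file.tempDir, "rawread_vs_chunks.txt");
    std.file.write(path, "a\r\nb\r\n");
    scope (exit) std.file.remove(path);

    writeln("CR bytes via rawRead: ", countViaRawRead(path));
    writeln("CR bytes via chunks:  ", countViaChunks(path));
}
```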
"H. S. Teoh" <hsteoh qfbox.info> writes:
On Thu, Mar 28, 2024 at 10:10:43PM +0000, jms via Digitalmars-d-learn wrote:
 On Thursday, 28 March 2024 at 02:30:11 UTC, jms wrote:
[...]
 I think I figured it out and the difference is probably in the mode.
 This documentation
 https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fread?view=msvc-170
 mentions that "If the given stream is opened in text mode,
 Windows-style newlines are converted into Unix-style newlines. That
 is, carriage return-line feed (CRLF) pairs are replaced by single line
 feed (LF) characters."
 
 And rawRead's documentation mentions that "rawRead always reads in
 binary mode on Windows.", which I guess should have given me a clue.
 chunks must be using text-mode.
It's not so much that chunks is using text-mode, but that you opened the file in text mode. On Windows, if you don't want CRLF translation you need to open your file with File(filename, "rb"), not just File(filename, "r"), because the latter defaults to text mode.


T

-- 
There's light at the end of the tunnel. It's the oncoming train.
Mar 28
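
The point about "r" versus "rb" can be tried with a short sketch like the one below. It is only an illustration under stated assumptions: the scratch file name and the countCR helper are hypothetical, and the file stands in for stdin. The reading code is the same chunks(f, 1) loop in both cases; only the mode string passed to File changes. On Windows the "r" count would be expected to drop to zero because of the CRLF translation, while "rb" should report every CR on any platform.

```d
import std.stdio;
static import std.file;
import std.path : buildPath;

// Count the CR (13) bytes seen when iterating `path` one byte at a time
// with chunks(), after opening it with the given fopen-style mode string.
size_t countCR(string path, string mode)
{
    size_t n;
    auto f = File(path, mode);
    foreach (ubyte[] buffer; chunks(f, 1))
    {
        if (buffer[0] == 13)
            n++;
    }
    return n;
}

void main()
{
    // Hypothetical scratch file; std.file.write stores the bytes unchanged.
    auto path = buildPath(std.file.tempDir, "crlf_demo.txt");
    std.file.write(path, "one\r\ntwo\r\n");
    scope (exit) std.file.remove(path);

    writeln(`mode "r":  `, countCR(path, "r"), " CR bytes");
    writeln(`mode "rb": `, countCR(path, "rb"), " CR bytes");
}
```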