
digitalmars.D - I/O extensions for common tasks

reply Andrew Pennebaker <andrew.pennebaker gmail.com> writes:
The stdlib has all the low-level components we need to do lots of 
different workflows. However, there are a few gaps in the API in 
terms of high level, common tasks. For example, the API can read 
an entire Unicode text file to a string for a given filename, but 
not for a given File object!

For now, I am accomplishing this with:

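// Convenience function for reading an entire UTF-8 file.
// Needs: import std.stdio : File;
string readUTF8(File f) {
     string s;

     foreach(ubyte[] buf; f.byChunk(1024)) {
         // Reinterpret the raw bytes as UTF-8 code units (no validation);
         // the append copies them, so reuse of the chunk buffer is safe.
         s ~= cast(const(char)[]) buf;
     }

     return s;
}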

Maybe not the best code, not much error handling and whatnot. 
Could we add something like this (but better code), to the 
standard lib?
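
As a rough sketch of what a sturdier version of the helper above might look like: the name readAllText and the chunkSize parameter are made up for illustration and are not part of Phobos. It collects the chunks in an appender instead of repeated string concatenation and checks that the result is valid UTF-8 before returning it.

```d
import std.array : appender;
import std.stdio : File;
import std.utf : validate;

/// Hypothetical helper (not in Phobos): reads everything remaining in `f`
/// and returns it as a UTF-8 string.
/// Throws std.utf.UTFException if the bytes are not valid UTF-8.
string readAllText(File f, size_t chunkSize = 4096)
{
    auto bytes = appender!(ubyte[])();

    // byChunk reuses one internal buffer; put() copies each chunk out of it.
    foreach (ubyte[] chunk; f.byChunk(chunkSize))
        bytes.put(chunk);

    // Reinterpret the accumulated bytes as UTF-8 code units, then check them.
    auto text = cast(string) bytes.data;
    validate(text);
    return text;
}

void main()
{
    import std.stdio : writeln;

    // File.tmpfile() gives an anonymous read/write file, handy for a demo.
    auto f = File.tmpfile();
    f.write("first line\nsecond line\n");
    f.rewind();

    writeln(readAllText(f));
}
```

Because it only relies on byChunk, the same function should work for any File that can be read to end-of-file, not just regular files on disk.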

Another little gap occurs when passing environment variables to 
process constructors, e.g. pipeProcess: since the redirect argument 
precedes env, it has to be spelled out, and there is no named enum 
value for the default case or for NO redirects. For now, I am 
copying the raw value from the source code, using:

cast(Redirect)7

as mentioned on the pipeProcess page.

https://dlang.org/library/std/process/pipe_process.html

This is risky, as this magic value could change over time as the 
API evolves. Could we declare a stable enum name for the default 
case?

Another API improvement for process creation, which could be done 
independently of this enum work, would be to add keyword argument 
defaults, e.g. for pipeProcess(). I think this could be done 
without breaking backwards compatibility, and it would give the 
user another way to set things like environment variables while 
continuing to use the default values for the other fields. What 
do you think?
Dec 09 2018
next sibling parent reply Andrew Pennebaker <andrew.pennebaker gmail.com> writes:
Update: Looks like cast(Redirect) 7 is the default, and cast(Redirect) 0 is interpreted as NO redirects. Would be good to use enums for both of these!
Dec 09 2018
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Monday, 10 December 2018 at 01:59:58 UTC, Andrew Pennebaker 
wrote:
 Would be good to use enums for both of these!
There is one, for all at least; you are just looking at buggy docs. Use mine instead, much more readable:

http://dpldocs.info/experimental-docs/std.process.pipeProcess.1.html

The proper name is Redirect.all. Though indeed, there does not seem to be a Redirect.none; there you would just use 0.
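
A small sketch of passing environment variables while naming the default redirect instead of casting 7. The helper name echoFromChild and the variable DEMO_VALUE are invented for the example, and it assumes a POSIX sh is available on the PATH.

```d
import std.process : pipeProcess, Redirect, wait;
import std.string : chomp;

/// Hypothetical helper: runs a POSIX shell child with one extra
/// environment variable and returns the first line it prints.
string echoFromChild(string value)
{
    // Entries in env are added to the environment inherited from the parent.
    string[string] env = ["DEMO_VALUE": value];

    // Redirect.all is the named form of cast(Redirect) 7.
    auto pipes = pipeProcess(["sh", "-c", "echo \"$DEMO_VALUE\""],
                             Redirect.all, env);
    scope (exit) wait(pipes.pid);

    return pipes.stdout.readln().chomp;
}

void main()
{
    import std.stdio : writeln;

    writeln(echoFromChild("hello from the parent"));
}
```

If a name for the zero case is wanted in user code, a local constant such as enum noRedirect = cast(Redirect) 0; would at least keep the magic number in one place, though that name is not something std.process provides.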
Dec 09 2018
next sibling parent reply Arun Chandrasekaran <aruncxy gmail.com> writes:
On Monday, 10 December 2018 at 02:15:20 UTC, Adam D. Ruppe wrote:
 On Monday, 10 December 2018 at 01:59:58 UTC, Andrew Pennebaker 
 wrote:
 Would be good to use enums for both of these!
 There is one, for all at least; you are just looking at buggy 
 docs. Use mine instead, much more readable:
 http://dpldocs.info/experimental-docs/std.process.pipeProcess.1.html

 The proper name is Redirect.all. Though indeed, there does not 
 seem to be a Redirect.none; there you would just use 0.
If both yours and the official docs are generated from the same source, how are the official docs buggy? Is there any way to fix them?
Dec 09 2018
parent Arun Chandrasekaran <aruncxy gmail.com> writes:
Bugs aside, your docs are much more readable and context aware!
Dec 09 2018
prev sibling parent Kagamin <spam here.lot> writes:
On Monday, 10 December 2018 at 02:15:20 UTC, Adam D. Ruppe wrote:
 http://dpldocs.info/experimental-docs/std.process.pipeProcess.1.html

 The proper name is Redirect.all. Though indeed, there does not 
 seem Redirect.none, there you would just use 0.
pipeProcess without redirects sounds like an oxymoron.
Dec 10 2018
prev sibling next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker 
wrote:
 The stdlib has all the low-level components we need to do lots 
 of different workflows. However, there are a few gaps in the 
 API in terms of high level, common tasks. For example, the API 
 can read an entire Unicode text file to a string for a given 
 filename, but not for a given File object!

 For now, I am accomplishing this with:

 // Convenience function for reading an entire UTF-8 file.
 string readUTF8(File f) {
     string s;

     foreach(ubyte[] buf; f.byChunk(1024)) {
         s ~= buf;
     }

     return s;
 }
There's a simpler way:

auto wholeFile = f.rawRead(new char[](f.size));

Note that this kind of question should rather go here: https://forum.dlang.org/group/learn
Dec 09 2018
parent Andrew Pennebaker <andrew.pennebaker gmail.com> writes:
On Monday, 10 December 2018 at 02:06:26 UTC, Basile B. wrote:
 There's a simpler way:

 auto wholeFile = f.rawRead(new char[](f.size));

 Note that this kind of question should rather go here: 
 https://forum.dlang.org/group/learn
For pipeProcess streams, the size is effectively unbounded until the process ends, so allocating a buffer of f.size leads to an out-of-memory error. Again, I think a convenience function would be good to include, to avoid these kinds of mistakes!
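
To illustrate the point about pipes, here is a sketch that still uses rawRead but with a fixed-size buffer in a loop, so it never asks the stream for its size. The names drain and captureStdout are hypothetical, and the demo command assumes a POSIX sh.

```d
import std.array : appender;
import std.process : pipeProcess, Redirect, wait;
import std.stdio : File;

/// Hypothetical helper: reads `f` until end-of-file without using f.size.
string drain(File f)
{
    ubyte[4096] buf;                     // fixed-size scratch buffer
    auto bytes = appender!(ubyte[])();

    while (true)
    {
        // rawRead returns the slice of buf it filled; empty means EOF.
        auto got = f.rawRead(buf[]);
        if (got.length == 0)
            break;
        bytes.put(got);                  // copies the bytes out of buf
    }
    return cast(string) bytes.data;      // bytes are not validated as UTF-8
}

/// Hypothetical helper: runs a command and returns everything it wrote
/// to standard output.
string captureStdout(string[] args)
{
    auto pipes = pipeProcess(args, Redirect.stdout);
    scope (exit) wait(pipes.pid);        // reap the child after draining
    return drain(pipes.stdout);
}

void main()
{
    import std.stdio : write;

    write(captureStdout(["sh", "-c", "echo one; echo two"]));
}
```

Memory use here grows with the amount of output actually produced, rather than with whatever size the pipe reports.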
Dec 10 2018
prev sibling parent reply Seb <seb wilzba.ch> writes:
On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker 
wrote:
 The stdlib has all the low-level components we need to do lots 
 of different workflows. However, there are a few gaps in the 
 API in terms of high level, common tasks. For example, the API 
 can read an entire Unicode text file to a string for a given 
 filename, but not for a given File object!

 [...]
What's wrong with readText: https://dlang.org/phobos/std_file.html#readText
Dec 09 2018
next sibling parent rikki cattermole <rikki cattermole.co.nz> writes:
On 10/12/2018 3:37 PM, Seb wrote:
 On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker wrote:
 The stdlib has all the low-level components we need to do lots of 
 different workflows. However, there are a few gaps in the API in terms 
 of high level, common tasks. For example, the API can read an entire 
 Unicode text file to a string for a given filename, but not for a 
 given File object!

 [...]
What's wrong with readText: https://dlang.org/phobos/std_file.html#readText
I thought the same, except it doesn't take a File.
Dec 09 2018
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/9/18 9:37 PM, Seb wrote:
 On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker wrote:
 The stdlib has all the low-level components we need to do lots of 
 different workflows. However, there are a few gaps in the API in terms 
 of high level, common tasks. For example, the API can read an entire 
 Unicode text file to a string for a given filename, but not for a 
 given File object!

 [...]
What's wrong with readText: https://dlang.org/phobos/std_file.html#readText
Problem statement is that given a File object, how do you read all the data out? :) std.file doesn't help there.

BTW, if you were using iopipe, the answer would be:

pipe.ensureElems(); // read all elements into the buffer [1]
auto data = pipe.window; // get all the elements as an array.

-Steve

[1] http://schveiguy.github.io/iopipe/iopipe/bufpipe/ensureElems.html
Dec 10 2018
parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 10 December 2018 at 14:44:56 UTC, Steven Schveighoffer 
wrote:
 On 12/9/18 9:37 PM, Seb wrote:
 On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker 
 wrote:
 The stdlib has all the low-level components we need to do 
 lots of different workflows. However, there are a few gaps in 
 the API in terms of high level, common tasks. For example, 
 the API can read an entire Unicode text file to a string for 
 a given filename, but not for a given File object!

 [...]
What's wrong with readText: https://dlang.org/phobos/std_file.html#readText
Problem statement is that given a File object, how do you read all the data out? :) std.file doesn't help there.
As far as I know, the best answer is `file.byLineCopy(Yes.keepTerminator).joiner` if you want characters, and `file.byChunk(...).joiner` if you want ubytes.
Dec 10 2018
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 12/11/18 12:24 AM, Paul Backus wrote:
 On Monday, 10 December 2018 at 14:44:56 UTC, Steven Schveighoffer wrote:
 On 12/9/18 9:37 PM, Seb wrote:
 On Monday, 10 December 2018 at 01:51:56 UTC, Andrew Pennebaker wrote:
 The stdlib has all the low-level components we need to do lots of 
 different workflows. However, there are a few gaps in the API in 
 terms of high level, common tasks. For example, the API can read an 
 entire Unicode text file to a string for a given filename, but not 
 for a given File object!

 [...]
What's wrong with readText: https://dlang.org/phobos/std_file.html#readText
Problem statement is that given a File object, how do you read all the data out? :) std.file doesn't help there.
As far as I know, the best answer is `file.byLineCopy(Yes.keepTerminator).joiner` if you want characters, and `file.byChunk(...).joiner` if you want ubytes.
Well, I would say the original request wants it in a string, so we are focusing on characters. However, this does NOT work:

file.byLine(Yes.keepTerminator).joiner.array;

Because it will make an array of dchars. What you really want is this:

cast(string)file.byChunk(4096).joiner.array;

-Steve
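
Put together as a compilable sketch (the function name slurp is made up for the example), the two expressions above look like this. The static assert only documents the dchar[] result type of the byLine variant at compile time; nothing is read there.

```d
import std.algorithm.iteration : joiner;
import std.array : array;
import std.stdio : File;
import std.typecons : Yes;

// byLine(...).joiner decodes UTF-8 into dchar, so .array is not a string.
static assert(is(typeof(File.init.byLine(Yes.keepTerminator).joiner.array)
                 == dchar[]));

/// Hypothetical helper: reads the rest of `file` into a string.
string slurp(File file)
{
    // byChunk yields ubyte[] slices, joiner flattens them into a range of
    // ubyte, and array copies that into a fresh ubyte[] that is then
    // reinterpreted as a string (the bytes are not validated as UTF-8).
    return cast(string) file.byChunk(4096).joiner.array;
}

void main()
{
    import std.stdio : write;

    auto f = File.tmpfile();
    f.write("alpha\nbeta\n");
    f.rewind();

    write(slurp(f));
}
```

Note that the cast trusts the input: if the file might not be UTF-8, calling std.utf.validate on the result before using it would be a reasonable extra step.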
Dec 11 2018