digitalmars.D - The new std.process is ready for review

"Lars T. Kyllingstad" <public kyllingen.net> writes:
It's been years in the coming, but we finally got it done. :)  
The upshot is that the module has actually seen active use over 
those years, both by yours truly and others, so hopefully the 
worst wrinkles are already ironed out.

Pull request:
https://github.com/D-Programming-Language/phobos/pull/1151

Code:
https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

Documentation:
http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

I hope we can get it reviewed in time for the next release.  (The 
wiki page indicates that both std.benchmark and std.uni are 
currently being reviewed, but I fail to find any "official" 
review threads on the forum.  Is the wiki just out of date?)

Lars
Feb 23 2013
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
23-Feb-2013 15:31, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :) The upshot
 is that the module has actually seen active use over those years, both
 by yours truly and others, so hopefully the worst wrinkles are already
 ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html


 I hope we can get it reviewed in time for the next release.  (The wiki
 page indicates that both std.benchmark and std.uni are currently being
 reviewed, but I fail to find any "official" review threads on the
 forum.  Is the wiki just out of date?)
Cool. I was about to suggest that now, since the release is out, we can hopefully start the review process.

As far as std.uni & std.benchmark go: some time ago std.benchmark was again collectively destroyed, and/or Andrei didn't have enough time to incorporate the feedback. This was back in 2012, IIRC. This is the last thread I can dig up:
http://forum.dlang.org/thread/mailman.73.1347916419.5162.digitalmars-d puremagic.com

Then (in 2013) std.uni saw an informal destruction, had its docs rewritten, and was proposed again. Not many comments were received after that, as people were fighting over the property stuff. This is the first thread about it:
http://forum.dlang.org/thread/kcppa1$30b9$1 digitalmars.com
The second one:
http://forum.dlang.org/thread/ke9gat$rg0$1 digitalmars.com

I'd suggest we start with anything that's ready, and the sooner the better. Now, with std.process proposed for review, I'm aware of 2 modules being ready; any others?

--
Dmitry Olshansky
Feb 23 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 12:31:19PM +0100, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  The
 upshot is that the module has actually seen active use over those
 years, both by yours truly and others, so hopefully the worst
 wrinkles are already ironed out.
Finally!!! *applause*
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151
 
 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d
 
 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
I just looked over the docs. Looks very good! Just a few minor comments:

- wait():
   - Some code examples would be nice.

   - For the POSIX-specific version, I thought the Posix standard
     specifies that the actual return code / signal number should be
     extracted by means of system-specific macros (in C anyway)?
     Wouldn't it be better to encapsulate this in a POD struct or
     something instead of exposing the implementation-specific values to
     the user?

   - How do I wait for *any* child process to terminate, not just a
     specific Pid?

- execute() and shell(): I'm a bit concerned about returning the
  *entire* output of a process as a string. What if the process generates
  too much output to store in a string? Would it be better to return a
  range instead (either a range of chars or range of lines maybe)? Or is
  this what pipeProcess was intended for? In any case, would it make
  sense to specify some kind of upper limit to the size of the output so
  that the program won't be vulnerable to bad subprocess behaviour
  (generate infinite output, etc.)?

- ProcessException: are there any specific methods to help user code
  extract information about the error? Or is the user expected to check
  errno himself (on Posix; or whatever it is on Windows)?
 I hope we can get it reviewed in time for the next release.  (The
 wiki page indicates that both std.benchmark and std.uni are
 currently being reviewed, but I fail to find any "official" review
 threads on the forum.  Is the wiki just out of date?)
[...]

I was *intending* to re-review std.uni; it's been sitting in my inbox for a few weeks already, but alas, I keep getting distracted by other things. I'll see if I can get to it today.

Sighh... so many fun things to do, so little time...


T

--
If I were two-faced, would I be wearing this one? -- Abraham Lincoln
Feb 23 2013
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 11:42:26 -0500, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 - wait():
    - Some code examples would be nice.

    - For the POSIX-specific version, I thought the Posix standard
      specifies that the actual return code / signal number should be
      extracted by means of system-specific macros (in C anyway)?
      Wouldn't it be better to encapsulate this in a POD struct or
      something instead of exposing the implementation-specific values to
      the user?
We handle the extraction as an implementation detail, the result should be cross-platform (at least on signal-using platforms). I don't know what a POD struct would get you, maybe you could elaborate what you mean?
    - How do I wait for *any* child process to terminate, not just a
      specific Pid?
I don't think we have a method to do that. It would be complex, especially if posix wait() returned a pid that we are not handling! I suppose what you could do is call posix wait (I have a feeling we may need to change our global wait function, or eliminate it), and then map the result back to a Pid you are tracking. You have any ideas how this could be implemented? I'd prefer not to keep a global cache of child process objects...
 - execute() and shell(): I'm a bit concerned about returning the
   *entire* output of a process as a string. What if the output generates
   too much output to store in a string? Would it be better to return a
   range instead (either a range of chars or range of lines maybe)? Or is
   this what pipeProcess was intended for? In any case, would it make
   sense to specify some kind of upper limit to the size of the output so
   that the program won't be vulnerable to bad subprocess behaviour
   (generate infinite output, etc.)?
Yes, pipeProcess gives you File objects for each of the streams for those cases where you expect lots of data to be returned, or want to process it as it comes. This is the use case I expect most people will use.

There are no doubt good use cases for execute/shell: we have a lot of non-generic string processing functions in phobos, and a lot of command line tools on an OS produce concise output that can be used.

In general, for input streams, ranges are not a good interface. Output ranges are good for output though, and I think File is a valid output range.
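
To make the distinction concrete, here is a minimal sketch of both styles. It assumes the API as it exists in current Phobos std.process (which may differ from the std.process2 snapshot under review) and a POSIX system with `echo`, `sh` and `printf` available; `countLines` is a hypothetical helper, not part of the module.

```d
import std.process : Config, Redirect, execute, pipeProcess, wait;

/// Hypothetical helper: counts the lines a command prints,
/// reading them one at a time through a pipe.
size_t countLines(string[] cmd)
{
    auto pipes = pipeProcess(cmd, Redirect.stdout);
    size_t n;
    foreach (line; pipes.stdout.byLine)
        ++n;                      // each line is handled as it arrives
    wait(pipes.pid);              // reap the child once its output is drained
    return n;
}

void main()
{
    // Concise output: execute() collects everything into one string.
    auto r = execute(["echo", "hello"]);
    assert(r.status == 0 && r.output == "hello\n");

    // In current Phobos the fourth argument (maxOutput) caps the
    // number of bytes that are captured.
    auto capped = execute(["echo", "hello"], null, Config.none, 1024);
    assert(capped.output == "hello\n");

    // Larger output: stream it instead of buffering it.
    assert(countLines(["sh", "-c", "printf 'a\\nb\\nc\\n'"]) == 3);
}
```

The streaming version never holds more than one line in memory, which is the property the question about unbounded output was after.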
 - ProcessException: are there any specific methods to help user code
   extract information about the error? Or is the user expected to check
   errno himself (on Posix; or whatever it is on Windows)?
This is a good idea. Right now, ProcessException converts the errno to a string message, but we could easily store the errno.

I say we, but I really mean Lars; he has done almost all the work :)

-Steve
Feb 23 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 03:15:26PM -0500, Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 11:42:26 -0500, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
- wait():
   - Some code examples would be nice.

   - For the POSIX-specific version, I thought the Posix standard
     specifies that the actual return code / signal number should be
     extracted by means of system-specific macros (in C anyway)?
     Wouldn't it be better to encapsulate this in a POD struct or
     something instead of exposing the implementation-specific values
     to the user?
We handle the extraction as an implementation detail, the result should be cross-platform (at least on signal-using platforms). I don't know what a POD struct would get you, maybe you could elaborate what you mean?
Oh, I thought the return value was just straight from the syscall, which requires WIFEXITED, WEXITSTATUS, WCOREDUMP, etc., to interpret. If it has already been suitably interpreted in std.process, then I guess it's OK. Otherwise, I was thinking of encapsulating these macros in some kind of POD struct, that provides methods like .ifExited, .exitStatus, .coreDump, etc. so that the user code doesn't have to directly play with the exact values returned by the specific OS.
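
For readers unfamiliar with those macros, the wrapper idea could look roughly like the sketch below. `ExitInfo` and its method names are hypothetical (nothing like it is claimed to exist in std.process); the `WIFEXITED`-style functions come from druntime's `core.sys.posix.sys.wait`, and C's `system()` is used here only because on POSIX it hands back a raw wait status.

```d
import core.stdc.stdlib : system;
import core.sys.posix.sys.wait : WEXITSTATUS, WIFEXITED, WIFSIGNALED, WTERMSIG;

/// Hypothetical POD wrapper around a raw POSIX wait status.
struct ExitInfo
{
    int raw;  // value exactly as returned by wait()/waitpid()/system()

    bool exited() const     { return WIFEXITED(raw); }    // normal termination?
    int  exitStatus() const { return WEXITSTATUS(raw); }  // valid only if exited
    bool signaled() const   { return WIFSIGNALED(raw); }  // killed by a signal?
    int  termSignal() const { return WTERMSIG(raw); }     // valid only if signaled
}

void main()
{
    // system() runs the command through sh and returns the raw status.
    auto info = ExitInfo(system("exit 3"));
    assert(info.exited && !info.signaled);
    assert(info.exitStatus == 3);
}
```

User code then never touches the bit layout of the status word, which is the encapsulation being asked about.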
   - How do I wait for *any* child process to terminate, not just a
     specific Pid?
I don't think we have a method to do that. It would be complex, especially if posix wait() returned a pid that we are not handling! I suppose what you could do is call posix wait (I have a feeling we may need to change our global wait function, or eliminate it), and then map the result back to a Pid you are tracking. You have any ideas how this could be implemented? I'd prefer not to keep a global cache of child process objects...
Why not? On Posix at least, you get SIGCHLD, etc., for all child processes anyway, so a global cache doesn't seem to be out-of-place. But you do have a point about pids that we aren't managing, e.g. if the user code is doing some fork()s on its own. But the way I see it, std.process is supposed to alleviate the need to do such things directly, so in my mind, if everything is going through std.process anyway, might as well just manage all child processes there. OTOH, this may cause problems if the D program links in C/C++ libraries that manage their own child processes.

Still, it would be nice to have some way of waiting for a set of child Pids, not just a single one. It would be a pain if user code had to manually manage child processes all the time when there's more than one of them running at a time.

Hmm. The more I think about it, the more it makes sense to just have std.process manage all child process related stuff. It's too painful to deal with multiple child processes otherwise. Maybe provide an opt-out in case you need to link in some C/C++ libraries that need their own child process handling, but the default, IMO, should be to manage everything through std.process.
- execute() and shell(): I'm a bit concerned about returning the
  *entire* output of a process as a string. What if the output
  generates too much output to store in a string? Would it be better
  to return a range instead (either a range of chars or range of
  lines maybe)? Or is this what pipeProcess was intended for? In any
  case, would it make sense to specify some kind of upper limit to
  the size of the output so that the program won't be vulnerable to
  bad subprocess behaviour (generate infinite output, etc.)?
Yes, pipeProcess gives you File objects for each of the streams for those cases where you expect lots of data to be returned, or want to process it as it comes. This is the use case I expect most people will use. There is no doubt good use cases for execute/shell, we have a lot of non-generic string processing functions in phobos, and a lot of command line tools on an OS produce a concise output that can be used.
True.
 In general, for input streams, ranges are not a good interface.
 Output ranges are good for output though, and I think File is a valid
 output range.
True.
- ProcessException: are there any specific methods to help user code
  extract information about the error? Or is the user expected to
  check errno himself (on Posix; or whatever it is on Windows)?
This is a good idea. Right now, ProcessException converts the errno to a string message, but we could easily store the errno. I say we, but I really mean Lars, he has done almost all the work :)
[...]

I never liked the design of errno in C... its being a global makes keeping track of errors a pain. It would be nice if the value of errno were saved in the Exception object at the time the error was encountered, instead of arbitrary amounts of code after, which may have changed its value.


T

--
GEEK = Gatherer of Extremely Enlightening Knowledge
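
A small sketch of that suggestion follows. `SavedErrnoException` and `failAndInspect` are hypothetical names chosen for illustration; this is not how ProcessException is implemented. The point is only that the code is read at the failure site and travels with the exception object.

```d
import core.stdc.errno : ENOENT, errno;
import std.conv : to;

/// Hypothetical exception type that carries the errno value
/// observed at the point of failure.
class SavedErrnoException : Exception
{
    immutable int code;

    this(string msg, int code, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg ~ " (errno " ~ code.to!string ~ ")", file, line);
        this.code = code;
    }
}

/// Simulates a failing call and shows that the saved code
/// survives later changes to the global errno.
int failAndInspect()
{
    try
    {
        errno = ENOENT;               // pretend a system call just failed
        immutable saved = errno;      // read it immediately at the failure site
        throw new SavedErrnoException("spawn failed", saved);
    }
    catch (SavedErrnoException e)
    {
        errno = 0;                    // unrelated code clobbers the global
        return e.code;                // the copy in the exception is unaffected
    }
}

void main()
{
    assert(failAndInspect() == ENOENT);
}
```

Phobos also has `std.exception.ErrnoException`, which records errno in a similar way.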
Feb 23 2013
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 17:46:04 -0500, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Sat, Feb 23, 2013 at 03:15:26PM -0500, Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 11:42:26 -0500, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

- wait():
   - Some code examples would be nice.

   - For the POSIX-specific version, I thought the Posix standard
     specifies that the actual return code / signal number should be
     extracted by means of system-specific macros (in C anyway)?
     Wouldn't it be better to encapsulate this in a POD struct or
     something instead of exposing the implementation-specific values
     to the user?
We handle the extraction as an implementation detail, the result should be cross-platform (at least on signal-using platforms). I don't know what a POD struct would get you, maybe you could elaborate what you mean?
Oh, I thought the return value was just straight from the syscall, which requires WIFEXITED, WEXITSTATUS, WCOREDUMP, etc., to interpret. If it has already been suitably interpreted in std.process, then I guess it's OK.
I think that's the case. (double checking) yes.
 Otherwise, I was thinking of encapsulating these macros in some kind of
 POD struct, that provides methods like .ifExited, .exitStatus,
 .coreDump, etc. so that the user code doesn't have to directly play with
 the exact values returned by the specific OS.
All we look at is WIFEXITED and WIFSIGNALED. I think the others are Linux specific. I don't think std.process2 should expose all the vagaries of the OS it's on, this is a cross-platform library. Doing signals was easy because we embedded it in the int. There is always pid.osHandle, which you can use to do whatever you want.
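
As an illustration of "embedded it in the int": in current Phobos, `wait` returns the exit code for a normal exit and, on POSIX, a negative number whose absolute value is the signal number when the child was killed by a signal. The sketch below assumes that behaviour and a POSIX `sh`; `runScript` is a hypothetical helper.

```d
import std.process : spawnProcess, wait;

/// Hypothetical helper: runs a shell snippet and returns
/// whatever std.process.wait reports for it.
int runScript(string script)
{
    auto pid = spawnProcess(["sh", "-c", script]);
    return wait(pid);
}

void main()
{
    // Normal termination: the exit code comes back unchanged.
    assert(runScript("exit 7") == 7);

    // The shell sends itself SIGKILL (signal 9); wait reports -9.
    assert(runScript("kill -9 $$") == -9);
}
```

A caller that only cares about success can still test for zero, while the sign distinguishes signals from ordinary exit codes without a separate type.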
   - How do I wait for *any* child process to terminate, not just a
     specific Pid?
I don't think we have a method to do that. It would be complex, especially if posix wait() returned a pid that we are not handling! I suppose what you could do is call posix wait (I have a feeling we may need to change our global wait function, or eliminate it), and then map the result back to a Pid you are tracking. You have any ideas how this could be implemented? I'd prefer not to keep a global cache of child process objects...
Why not? On Posix at least, you get SIGCHLD, etc., for all child processes anyway, so a global cache doesn't seem to be out-of-place. But you do have a point about pids that we aren't managing, e.g. if the user code is doing some fork()s on its own. But the way I see it, std.process is supposed to alleviate the need to do such things directly, so in my mind, if everything is going through std.process anyway, might as well just manage all child processes there. OTOH, this may cause problems if the D program links in C/C++ libraries that manage their own child processes.
Well, there is always the possibility of a child creating a child, and exiting, the grandchild then becomes our child. There is no way to predict or plan for that. If we expose the general wait call, then we will be subject to odd cases, and I think at that point, it's a specialized application. We provide a way to get back to OS-specific land via osHandle.
 Still, it would be nice to have some way of waiting for a set of child
 Pids, not just a single one. It would be a pain if user code had to
 manually manage child processes all the time when there's more than one
 of them running at a time.
This is not possible on Linux/OSX (you can't specify the process subset to wait for), but possible on Windows. We chose not to expose that because it's very application specific, and we are trying to write a cross platform library. You can always write a function that does this. Simple example:

    import core.sys.posix.sys.wait : wait;   // the raw POSIX wait()

    Pid[int] processes;
    // create processes, storing them by OS pid
    int status;
    int pid;
    // wait() returns -1 once there are no children left
    while ((pid = wait(&status)) > 0)
    {
        if (auto p = pid in processes)
        {
            // handle child exiting
        }
    }
 Hmm. The more I think about it, the more it makes sense to just have
 std.process manage all child process related stuff. It's too painful to
 deal with multiple child processes otherwise. Maybe provide an opt-out
 in case you need to link in some C/C++ libraries that need their own
 child process handling, but the default, IMO, should be to manage
 everything through std.process.
I can imagine that a ProcessManager singleton class could be written that collects all exited children, and does anything you need. But I don't know if it's a necessary component for std.process to go out the door. We currently have no such feature, so I would push for std.process to be reviewed and accepted without that, and then consider that an enhancement request.

Now, one thing we could probably do quickly and easily is add a wait function that returns immediately if the process is not done. I'm pretty sure that is supported on all platforms.

-Steve
Feb 23 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 06:55:19PM -0500, Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 17:46:04 -0500, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
On Sat, Feb 23, 2013 at 03:15:26PM -0500, Steven Schveighoffer wrote:
[...]
 All we look at is WIFEXITED and WIFSIGNALED.  I think the others are
 Linux specific.  I don't think std.process2 should expose all the
 vagaries of the OS it's on, this is a cross-platform library.  Doing
 signals was easy because we embedded it in the int.
Fair enough, I think WIFEXITED and WIFSIGNALED probably cover 99.5% of the common use cases anyway, so it's probably not worth sweating over.

BTW, is "std.process2" just the temporary name, or are we seriously going to put a "std.process2" into Phobos? I'm hoping the former, as the latter is unforgivably ugly.
 There is always pid.osHandle, which you can use to do whatever you want.
True.
   - How do I wait for *any* child process to terminate, not just a
     specific Pid?
I don't think we have a method to do that. It would be complex, especially if posix wait() returned a pid that we are not handling!
[...]
Hmm. The more I think about it, the more it makes sense to just have
std.process manage all child process related stuff. It's too painful
to deal with multiple child processes otherwise. Maybe provide an
opt-out in case you need to link in some C/C++ libraries that need
their own child process handling, but the default, IMO, should be to
manage everything through std.process.
I can imagine that a ProcessManager singleton class could be written that collects all exited children, and does anything you need. But I don't know if it's a necessary component for std.process to go out the door. We currently have no such feature, so I would push for std.process to be reviewed and accepted without that, and then consider that an enhancement request.
[...] Fair enough, we do want the new std.process to get in ASAP.
 Now, one thing we could probably do quickly and easily is add a wait
 function that returns immediately if the process is not done.  I'm
 pretty sure that is supported on all platforms.
[...]

Excellent! I think if we had this, it would address most of my concerns above. A non-blocking wait would allow user code to do things like monitor the progress of child processes, etc., and basically implement whatever OS-specific stuff it may need to, without adding too much complication into std.process.

I vote for putting this in, and just leave the handling of multiple child processes to user code.


T

--
What are you when you run out of Monet? Baroque.
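
A sketch of how user code might handle several children with a non-blocking wait is given below. It assumes `tryWait` as found in current Phobos std.process, which returns a tuple with `terminated` and `status` fields, and a POSIX `sh`; `waitAll` and the child names are hypothetical, and simple polling stands in for whatever event mechanism a real application would use.

```d
import core.thread : Thread;
import core.time : msecs;
import std.process : Pid, spawnProcess, tryWait;

/// Hypothetical helper: polls a set of children until all have
/// terminated and returns their statuses keyed by name.
int[string] waitAll(Pid[string] children)
{
    int[string] statuses;
    while (children.length)
    {
        // .keys returns a copy, so entries can be removed inside the loop.
        foreach (name; children.keys)
        {
            auto r = tryWait(children[name]);   // returns immediately
            if (r.terminated)
            {
                statuses[name] = r.status;
                children.remove(name);
            }
        }
        if (children.length)
            Thread.sleep(20.msecs);             // avoid a busy loop
    }
    return statuses;
}

void main()
{
    auto st = waitAll([
        "fast": spawnProcess(["sh", "-c", "exit 1"]),
        "slow": spawnProcess(["sh", "-c", "sleep 1; exit 2"]),
    ]);
    assert(st["fast"] == 1 && st["slow"] == 2);
}
```

Because only Pids created through std.process are polled, children started by other libraries are left alone, which sidesteps the global-cache concern raised earlier in the thread.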
Feb 23 2013
"Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 00:11:42 UTC, H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we 
 seriously
 going to put in a "std.process2" into Phobos? I'm hoping the 
 former, as
 the latter is unforgivably ugly.
I agree, it's not ideal, but "unforgivably ugly" is taking it a bit far. :)

Anyway, to be honest, I named it std.process2 because I got tired of merge conflicts whenever someone made changes in Phobos master that either directly or indirectly involved the current std.process. Whether it should finally be named std.process or std.process2 is open for debate, IMO, but I have to admit that I am to an increasing degree starting to understand Walter's point of view on these matters...

Lars
Feb 24 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we seriously
 going to put in a "std.process2" into Phobos? I'm hoping the former, as
 the latter is unforgivably ugly.
In previous discussions, it was agreed that future replacement modules would simply have a number appended to them like that (e.g. std.xml2 or std.random2). I don't think that that decision is irreversible, but unless someone can come up with a much better name, I'd expect it to stick, and it has the advantage of making it very clear that it's replacing the old one.

- Jonathan M Davis
Feb 23 2013
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 19:25:33 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we seriously
 going to put in a "std.process2" into Phobos? I'm hoping the former, as
 the latter is unforgivably ugly.
In previous discussions, it was agreed that future replacement modules would simply have a number appended to them like that (e.g. std.xml2 or std.random2). I don't think that that decision is irreversible, but unless someone can come up with a much better name, I'd expect it to stick, and it has the advantage of making it very clear that it's replacing the old one.
Yeah, I don't want to get into this discussion again. There are better ways (at least IMO :), but they were not favored.

Once std.process2 is accepted, and in use for a long time, we can probably deprecate std.process. But I don't know if std.process2 would then be renamed. I can't remember what was decided.

-Steve
Feb 23 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 19:32:48 Steven Schveighoffer wrote:
 Yeah, I don't want to get into this discussion again.  There are better
 ways (at least IMO :), but they were not favored.
 
 Once std.process2 is accepted, and in use for a long time, we can probably
 deprecate std.process.  But I don't know if std.process2 would then be
 renamed.  I can't remember what was decided.
We might be able to remove std.process eventually and then rename std.process2 to std.process (leaving std.process2.d to import std.process), but Walter (and to some extent Andrei) seems to be very much in favor of leaving stuff around permanently. It's likely that std.process will be deprecated (which now defaults to warning about it rather than giving an error) and eventually undocumented, but actually killing it off may take a bit of doing given Walter's attitude about code breakage. He seems to be perfectly fine with leaving around old, dead code on the off-chance that some older code is using it and would break if it were removed.

- Jonathan M Davis
Feb 23 2013
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 20:07:43 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Saturday, February 23, 2013 19:32:48 Steven Schveighoffer wrote:
 Yeah, I don't want to get into this discussion again.  There are better
 ways (at least IMO :), but they were not favored.

 Once std.process2 is accepted, and in use for a long time, we can  
 probably
 deprecate std.process.  But I don't know if std.process2 would then be
 renamed.  I can't remember what was decided.
We might be able to remove std.process eventually and then rename std.process2 to std.process (leaving std.process2.d to import std.process), but Walter (and to some extent Andrei) seems to be very much in favor of leaving stuff around permanently. It's likely that std.process will be deprecated (which now defaults to warning about it rather than giving an error) and eventually undocumented, but actually killing it off may take a bit of doing given Walter's attitude about code breakage. He seems to be perfectly fine with leaving around old, dead code on the off-chance that some older code is using it and would break if it were removed.
I don't see std.date around anymore...

-Steve
Feb 23 2013
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 20:14:14 Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 20:07:43 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 On Saturday, February 23, 2013 19:32:48 Steven Schveighoffer wrote:
 Yeah, I don't want to get into this discussion again.  There are better
 ways (at least IMO :), but they were not favored.
 
 Once std.process2 is accepted, and in use for a long time, we can
 probably
 deprecate std.process.  But I don't know if std.process2 would then be
 renamed.  I can't remember what was decided.
We might be able to remove std.process eventually and then rename std.process2 to std.process (leaving std.process2.d to import std.process), but Walter (and to some extent Andrei) seems to be very much in favor of leaving stuff around permanently. It's likely that std.process will be deprecated (which now defaults to warning about it rather than giving an error) and eventually undocumented, but actually killing it off may take a bit of doing given Walter's attitude about code breakage. He seems to be perfectly fine with leaving around old, dead code on the off-chance that some older code is using it and would break if it were removed.
I don't see std.date around anymore...
Yes. I killed it, but Walter has never liked that sort of thing and has been increasingly outspoken about it, and Andrei seems to be jumping on that bandwagon.

For instance, IIRC, they both griped about actually removing the deprecated functions from std.string. I'd _very_ much like to get rid of them outright, since they clutter the code and actually were generating errors when used until recently (since they were deprecated before the changes to deprecated). Keeping them around is just plain harmful IMHO, and I may yet manage to kill them off, but they don't seem to like the idea, and I fully expect a similar attitude towards something like std.process.

Unfortunately, while replacing old solutions with new, better solutions seems to be fine, it doesn't seem to be okay to actually get rid of the old ones anymore.

- Jonathan M Davis
Feb 23 2013
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 05:32:08PM -0800, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 20:14:14 Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 20:07:43 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 wrote:
[...]
 We might be able to remove std.process eventually and then rename
 std.process2 to std.process (leaving std.process2.d to import
 std.process), but Walter (and to some extent Andrei) seems to be
 very much in favor of leaving stuff around permanently. It's
 likely that std.process will be deprecated (which now defaults to
 warning about it rather than giving an error) and eventually
 undocumented, but actually killing it off may take a bit of doing
 given Walter's attitude about code breakage. He seems to be
 perfectly fine with leaving around old, dead code on the
 off-chance that some older code is using it and would break if it
 were removed.
I don't see std.date around anymore...
Yes. I killed it, but Walter has never liked that sort of thing and has been increasingly outspoken about it, and Andrei seems to be jumping on that bandwagon. For instance, IIRC, they both griped about actually removing the deprecated functions from std.string. I'd _very_ much like to get rid of them outright, since they clutter the code and actually were generating errors when used until recently (since they were deprecated before the changes to deprecated). Keeping them around is just plain harmful IMHO, and I may yet manage to kill them off, but they don't seem to like the idea, and I fully expect a similar attitude towards something like std.process. Unfortunately, while replacing old solutions with new, better solutions seems to be fine, it doesn't seem to be okay to actually get rid of the old ones anymore.
[...]

Well, std.regexp got lucky, in that the new module has a subtly different name std.regex, so we can just eventually stop documenting std.regexp but leave it in the codebase for whatever old code that uses it to continue working. Ditto with the upcoming std.io to replace std.stdio.

But I can't think of any better name for the new std.process. :-( I'd suggest std.proc, but I'm pretty sure it will get shot down as it's too short and ambiguous (it could be misinterpreted as std.procedure for example).

Alternatively, I would push for renaming the old std.process to something like old.process (or something else), which is much less of a breakage than deleting it from Phobos outright: existing code just needs to have its imports fixed and will continue working, whereas deleting the module outright leaves existing code with no recourse but to potentially rewrite from scratch. This may be easier to convince Walter & Andrei of than outright killing old deprecated modules.


T

--
There's light at the end of the tunnel. It's the oncoming train.
Feb 23 2013
1100110 <0b1100110 gmail.com> writes:
On 02/23/2013 08:39 PM, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 05:32:08PM -0800, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 20:14:14 Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 20:07:43 -0500, Jonathan M Davis<jmdavisProg gmx.com>
 wrote:
[...]
 We might be able to remove std.process eventually and then rename
 std.process2 to std.process (leaving std.process2.d to import
 std.process), but Walter (and to some extent Andrei) seems to be
 very much in favor of leaving stuff around permanently. It's
 likely that std.process will be deprecated (which now defaults to
 warning about it rather than giving an error) and eventually
 undocumented, but actually killing it off may take a bit of doing
 given Walter's attitude about code breakage. He seems to be
 perfectly fine with leaving around old, dead code on the
 off-chance that some older code is using it and would break if it
 were removed.
I don't see std.date around anymore...
Yes. I killed it, but Walter has never liked that sort of thing and has been increasingly outspoken about it, and Andrei seems to be jumping on that bandwagon. For instance, IIRC, they both griped about actually removing the deprecated functions from std.string. I'd _very_ much like to get rid of them outright, since they clutter the code and actually were generating errors when used until recently (since they were deprecated before the changes to deprecated). Keeping them around is just plain harmful IMHO, and I may yet manage to kill them off, but they don't seem to like the idea, and I fully expect a similar attitude towards something like std.process. Unfortunately, while replacing old solutions with new, better solutions seems to be fine, it doesn't seem to be okay to actually get rid of the old ones anymore.
[...] Well, std.regexp got lucky, in that the new module has a subtly different name std.regex, so we can just eventually stop documenting std.regexp but leave it in the codebase for whatever old code that uses it to continue working. Ditto with the upcoming std.io to replace std.stdio. But I can't think of any better name for the new std.process. :-( I'd suggest std.proc, but I'm pretty sure it will get shot down as it's too short and ambiguous (it could be misinterpreted as std.procedure for example). Alternatively, I would push for renaming the old std.process to something like old.process (or something else), which is much less of a breakage than deleting it from Phobos outright -- existing code just needs to have its imports fixed and will continue working, whereas deleting the module outright leaves existing code with no recourse but to potentially rewrite from scratch. This may be easier to convince Walter & Andrei on, than outright killing old deprecated modules. T
+1
Feb 24 2013
prev sibling parent "js.mdnq" <js_adddot+mdng gmail.com> writes:
On Sunday, 24 February 2013 at 00:25:46 UTC, Jonathan M Davis 
wrote:
 On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we 
 seriously
 going to put in a "std.process2" into Phobos? I'm hoping the 
 former, as
 the latter is unforgivably ugly.
In previous discussions, it was agreed that future replacement modules would simply have a number appended to them like that (e.g. std.xml2 or std.random2). I don't think that that decision is irreversible, but unless someone can come up with a much better name, I'd expect it to stick, and it has the advantage of making it very clear that it's replacing the old one. - Jonathan M Davis
That is a really really really bad idea! There are much better versioning methods out there.

 import "std.process"; // uses latest version by default, or an exact file name match
 import "std.process"[3 > ver >= 2];

or

 import "std.process"[hash == hashid];

would be a better way. Module file names could have versioning info attached, similar to how MS does it:

 process.hash.versionid.otherattributes

The attributes are matched only if they are used, else ignored. Hence

 process.hash.version.otherattributes.d
 process.d

would be logically identical (and throw an error if they are in the same dir and no attribute matching is used), but one could specify attribute matching to narrow down the choice. (Or the latest version could be used by default and a warning thrown about multiple choices.)

This allows one to keep multiple versions of the same module name in the same dir. It helps with upgrading because you can easily switch modules (one could set global matches instead of per module). One could also have the compiler attach the latest match on the import, so each compilation uses the latest version but it does not have to be specified by the user. When the user distributes the code it will have the proper matching elements in the code, e.g.,

 import "std.process"[auto]; // auto will be replaced by the compiler with the appropriate matching attributes

Possibly better to specify this through a command line arg instead. Another thing one can do to help is to have the compiler automatically modify the source code to include which module it was compiled with, by hash and/or version. When the code is recompiled a warning can be given that a different version was used.
Mar 05 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 04:25:33PM -0800, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we seriously
 going to put in a "std.process2" into Phobos? I'm hoping the former,
 as the latter is unforgivably ugly.
In previous discussions, it was agreed that future replacement modules would simply have a number appended to them like that (e.g. std.xml2 or std.random2). I don't think that that decision is irreversible, but unless someone can come up with a much better name, I'd expect it to stick, and it has the advantage of making it very clear that it's replacing the old one.
[...] Ugh. I don't like this. I can see where it's coming from, and why it's necessary (to avoid breaking tons of code relying on the old API), but I really don't like it. It leads to the ugly situation of code that relies on both std.xyz7 and std.xyz19, just because parts of the code were written at different times, and it's too much work to clean up, which leaves people who read the code having to remember how version 19 of xyz differed from version 7. Unless we go through a deprecation process (no pun intended) where std.process2 is a temporary name until the old std.process is phased out, then std.process2 is renamed to std.process (perhaps leaving a wrapper public import in std.process2). I would really hate to see Phobos deteriorate into a situation of std.algorithm5, std.io7, std.regex4, std.process3, std.range5, where nobody can remember which version of which module is the most current without looking it up, just because we have to keep all the old names for backwards-compatibility. T -- Why do conspiracy theories always come from the same people??
Feb 23 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 19:44:29 -0500, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Sat, Feb 23, 2013 at 04:25:33PM -0800, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 16:09:43 H. S. Teoh wrote:
 BTW, is "std.process2" just the temporary name, or are we seriously
 going to put in a "std.process2" into Phobos? I'm hoping the former,
 as the latter is unforgivably ugly.
In previous discussions, it was agreed that future replacement modules would simply have a number appended to them like that (e.g. std.xml2 or std.random2). I don't think that that decision is irreversible, but unless someone can come up with a much better name, I'd expect it to stick, and it has the advantage of making it very clear that it's replacing the old one.
[...] Ugh. I don't like this. I can see where it's coming from, and why it's necessary (to avoid breaking tons of code relying on the old API), but I really don't like it. It leads to the ugly situation of code that relies on both std.xyz7 and std.xyz19, just because parts of the code were written at different times, and it's too much work to clean up, which leaves people who read the code having to remember how version 19 of xyz differed from version 7. Unless we go through a deprecation process (no pun intended) where std.process2 is a temporary name until the old std.process is phased out, then std.process2 is renamed to std.process (perhaps leaving a wrapper public import in std.process2). I would really hate to see Phobos deteriorate into a situation of std.algorithm5, std.io7, std.regex4, std.process3, std.range5, where nobody can remember which version of which module is the most current without looking it up, just because we have to keep all the old names for backwards-compatibility.
I'm not sure this would happen. In order for a module to be "renumbered", it needs to be a complete rewrite from scratch, with no common ancestry. If we have anything that goes to 3, something is wrong. We should not be approving new designs if we plan to get rid of them later. There are several modules in Phobos that are/were so bad they need to be rewritten from scratch. std.process, std.xml are the two that come off the top of my head. These were written back when Phobos was very young and foolish, and accepted any old design that a random programmer came up with. std.range, std.algorithm, std.regex are all pretty safe, they aren't going to be rewritten unless something catastrophic is discovered in them that invalidates their entire design. In fact, I think all of these were rewritten already. So I think we just have to deal with it. Just think, you will know some trivia when you are teaching D to some young developer someday :) -Steve
Feb 23 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 07:59:45PM -0500, Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 19:44:29 -0500, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
[...]
I would really hate to see Phobos deteriorate into a situation of
std.algorithm5, std.io7, std.regex4, std.process3, std.range5, where
nobody can remember which version of which module is the most current
without looking it up, just because we have to keep all the old names
for backwards-compatibility.
I'm not sure this would happen. In order for a module to be "renumbered", it needs to be a complete rewrite from scratch, with no common ancestry. If we have anything that goes to 3, something is wrong. We should not be approving new designs if we plan to get rid of them later. There are several modules in Phobos that are/were so bad they need to be rewritten from scratch. std.process, std.xml are the two that come off the top of my head. These were written back when Phobos was very young and foolish, and accepted any old design that a random programmer came up with. std.range, std.algorithm, std.regex are all pretty safe, they aren't going to be rewritten unless something catastrophic is discovered in them that invalidates their entire design. In fact, I think all of these were rewritten already. So I think we just have to deal with it. Just think, you will know some trivia when you are teaching D to some young developer someday :)
[...] Well, in this case, I'd push for a deprecation path for the badly designed modules, so that the current std.process will eventually be phased out and replaced with std.process2, and then we can rename std.process2 to std.process (and leave a public import in std.process2 to maintain compatibility with current code). I really do not like the name "std.process2". It is exactly the kind of thing that will cause newbies to avoid it and go back to the old badly designed std.process, and then come back to complain about it, then when told to use std.process2, they will wonder "why the 2?". It's just ugly. OTOH, if we're going to be reorganizing the Phobos module hierarchy, then that may be a good time to get the new std.process into the right name, and leave the old one somewhere else (maybe remain as std.process if the new one goes somewhere else in the hierarchy). T -- VI = Visual Irritation
Feb 23 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 21:30:57 -0500, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 I really do not like the name "std.process2". It is exactly the kind of
 thing that will cause newbies to avoid it and go back to the old badly
 designed std.process, and then come back to complain about it, then when
 told to use std.process2, they will wonder "why the 2?". It's just ugly.
Well, we can make importing std.process uncomfortable (read: print a warning whenever you include it), and it won't be in the docs. If anything, I would think a newbie would just wonder why the 2. And then use it :) I'm not saying I think it's the best situation, but I'm not in charge here...
 OTOH, if we're going to be reorganizing the Phobos module hierarchy,
 then that may be a good time to get the new std.process into the right
 name, and leave the old one somewhere else (maybe remain as
 std.process if the new one goes somewhere else in the hierarchy).
What? AFAIK, this is not in the plan. If Walter and Andrei are willing to reorganize the whole tree, but have a problem with renaming std.process to std.oldprocess, I feel that's pretty inconsistent... -Steve
Feb 23 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 22:08:44 Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 21:30:57 -0500, H. S. Teoh <hsteoh quickfur.ath.cx>
 OTOH, if we're going to be reorganizing the Phobos module hierarchy,
 then that may be a good time to get the new std.process into the right
 name, and leave the old one somewhere else (maybe remain as
 std.process if the new one goes somewhere else in the hierarchy).
What? AFAIK, this is not in the plan. If Walter and Andrei are willing to reorganize the whole tree, but have a problem with renaming std.process to std.oldprocess, I feel that's pretty inconsistent...
Any reorganization that occurred would involve leaving the old modules around but have them simply importing the new ones. But there are no definitive plans to rearrange any modules at this point. Some folks complain about how flat Phobos' hierarchy is from time to time, and recently, Don started a discussion on possibly rearranging it, but nothing has been decided. While it makes some sense to have a deeper hierarchy for newer stuff where appropriate, I'm not at all convinced it's worth the churn of moving any of the old stuff around (even if the equivalent of aliases are left around for the old modules). But we'll see what happens. I expect that nothing will change though, if nothing else, because there hasn't been a big push to change anything, just a few folks complaining about it from time to time. - Jonathan M Davis
Feb 23 2013
prev sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/24/13, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 I'm not sure this would happen.  In order for a module to be "renumbered",

 it needs to be a complete rewrite from scratch, with no common ancestry.
 If we have anything that goes to 3, something is wrong.  We should not be
 approving new designs if we plan to get rid of them later.
Ah but how can you guarantee we won't ever need a 3rd rewrite? It's always possible we might need one in the future. It's also a problem if we have to start remembering version numbers for each module we import. E.g.:

import std.process2;
import std.xml; // oops, did I mean xml2 maybe?
import std.signals2;

It's going to be annoying using Phobos like that. I was going to suggest using version flags, but even that could be annoying, although that feature was practically invented for this kind of problem.
Feb 23 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 21:46:32 -0500, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 On 2/24/13, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 I'm not sure this would happen.  In order for a module to be  
 "renumbered",

 it needs to be a complete rewrite from scratch, with no common ancestry.
 If we have anything that goes to 3, something is wrong.  We should not  
 be
 approving new designs if we plan to get rid of them later.
Ah but how can you guarantee we won't ever need a 3rd rewrite? It's always possible we might need one in the future.
I can't *guarantee* it, but I think it's counter-productive to keep ripping apart standard library designs and recreating them. I think after this many years, we should have come up with a design that works well. That is the point of having it reviewed by all you smart people out there! If there is wide consensus that std.process2 does not have a good API, or even a large division of opinions, we may need to rethink the API. I hope this is not the case! Note, a rewrite of the implementation does not require a rename. It's the API which is critical to get right the first (or second) time.
 It's also a problem if we have to start remembering version numbers
 for each module we import. E.g.:

 import std.process2;
 import std.xml; // oops, did I mean xml2 maybe?
This will print a warning to import std.xml2 instead.
 It's going to be annoying using Phobos like that. I was going to
 suggest using version flags, but even that could be annoying, although
 that feature was practically invented for this kind of problem.
No, version is mutually exclusive. You can't have all of phobos depending on std.process version 2, but you want to use std.process version 1. Both would have to be compiled in. I don't think you can do that with version statements. My recommendation would be to have an 'old' directory, and if people want to use the old process, use old.process. We have a nice module system, I think we should use it! Oh shit, I said I didn't want to get back into this debate again... a lot of good that did... Forget everything I said... -Steve
Feb 23 2013
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/24/13, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 No, version is mutually exclusive.  You can't have all of phobos depending
 on std.process version 2, but you want to use std.process version 1.  Both
 would have to be compiled in.  I don't think you can do that with version
 statements.
I meant something like this:

std\process1.d -- old process
std\process2.d -- new process

std\process.d:
----------
version(OldProcess)
    public import std.process1;
else
    public import std.process2;
----------

Phobos modules which already use std.process would have to be changed to directly import std.process1 or std.process2. Old user code which wants to still compile using the old process would have to add the -version=OldProcess switch /or/ change import statements to 'import std.process1'. If the old code wants to use both it can simply import 'std.process1' or 'std.process2' as needed. New code would simply import std.process and use the new code without having to fiddle with anything.
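
The selection logic can be tried out in a single file. This is only a sketch under stated assumptions: the two structs are hypothetical stand-ins for std.process1 and std.process2 (neither module exists in Phobos), and the alias plays the role that the public import plays in std\process.d.

```d
import std.stdio : writeln;

// Hypothetical stand-ins for the old and new modules (std.process1 and
// std.process2 in the layout above); neither exists in Phobos.
struct ProcessV1 { enum api = "old"; }
struct ProcessV2 { enum api = "new"; }

// Plays the role of std\process.d: forward to one implementation,
// chosen at compile time by the OldProcess version identifier.
version (OldProcess)
    alias Process = ProcessV1;
else
    alias Process = ProcessV2;

void main()
{
    // Built normally this selects ProcessV2; built with
    // -version=OldProcess it selects ProcessV1.
    writeln("selected API: ", Process.api);
}
```

The choice is made once per compilation, so a single build sees only one of the two behind the forwarding name; code that needs both would still have to name the numbered stand-ins directly.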
Feb 23 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library. Andrei
Feb 24 2013
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 24, 2013 10:05:01 Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
An interesting idea, but someone would have to implement it, which could delay adding the new std.process to Phobos. Of course, if we take the approach of putting new modules in a different place initially as we've occasionally discussed (e.g. experimental.process or future.process) in order to make sure that they're fully ironed out from actual, widespread, real-world use before freezing their APIs, then that would give us more time to implement such a feature before moving the module to std.process. - Jonathan M Davis
Feb 24 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Feb-2013 12:05, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
The same could be achieved by simply using old version of compiler+druntime+phobos to compile old project. I don't get the desire to keep old junk forever. A year or two - maybe. More than this is just insane.
 Andrei
-- Dmitry Olshansky
Feb 24 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 24, 2013 13:06:00 Dmitry Olshansky wrote:
 24-Feb-2013 12:05, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
The same could be achieved by simply using old version of compiler+druntime+phobos to compile old project. I don't get the desire to keep old junk forever. A year or two - maybe. More than this is just insane.
I agree, but it _is_ more than a question of keeping old junk around in this case. We need a way to transition cleanly, and the only way at present to do that without breaking code is to put the new std.process somewhere other than std.process, since if we put it in std.process, it would mean breaking a lot of code which uses the current std.process, forcing everyone to stick with the old compiler until they'd updated their code. And we don't want that. Whether the old std.process sticks around for more than a year or two is therefore a separate matter from what we name the new std.process unless we add a feature like Andrei is suggesting. - Jonathan M Davis
Feb 24 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Feb-2013 13:17, Jonathan M Davis wrote:
 On Sunday, February 24, 2013 13:06:00 Dmitry Olshansky wrote:
 24-Feb-2013 12:05, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
The same could be achieved by simply using old version of compiler+druntime+phobos to compile old project. I don't get the desire to keep old junk forever. A year or two - maybe. More than this is just insane.
I agree, but it _is_ more than a question of keeping old junk around in this case. We need a way to transition cleanly, and the only way at present to do that without breaking code is to put the new std.process somewhere other than std.process, since if we put it in std.process, it would mean breaking a lot of code which uses the current std.process, forcing everyone to stick with the old compiler until they'd updated their code. And we don't want that.
I suppose it's easy to translate any active project (as in maintained) to the new std.process, as it's supposed to have easier/richer interface. Then any old cruft that just works and there is no need to touch it can just use the older version of _compiler_. In any case the phrase "old code that needs to use new std.process" is kind of perplexing.
 Whether the old std.process sticks around for more than a year or two is
 therefore a separate matter from what we name the new std.process unless we
 add a feature like Andrei is suggesting.
Maybe I'm missing something but I don't see this feature solving anything yet. -- Dmitry Olshansky
Feb 24 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, February 24, 2013 13:35:31 Dmitry Olshansky wrote:
 24-Feb-2013 13:17, Jonathan M Davis wrote:
 On Sunday, February 24, 2013 13:06:00 Dmitry Olshansky wrote:
 24-Feb-2013 12:05, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
The same could be achieved by simply using old version of compiler+druntime+phobos to compile old project. I don't get the desire to keep old junk forever. A year or two - maybe. More than this is just insane.
I agree, but it _is_ more than a question of keeping old junk around in this case. We need a way to transition cleanly, and the only way at present to do that without breaking code is to put the new std.process somewhere other than std.process, since if we put it in std.process, it would mean breaking a lot of code which uses the current std.process, forcing everyone to stick with the old compiler until they'd updated their code. And we don't want that.
I suppose it's easy to translate any active project (as in maintained) to the new std.process, as it's supposed to have easier/richer interface. Then any old cruft that just works and there is no need to touch it can just use the older version of _compiler_. In any case the phrase "old code that needs to use new std.process" is kind of perplexing.
It's the fact that you're forced to update your code when you update the compiler rather than giving you time to update it which is the problem. I'm not completely against the idea of saying that we have the old std.process in one version and the new std.process in another - especially if we had more of a major-minor versioning scheme where it was expected that code would break at major version changes, but we don't really have that right now, and I really don't think that Walter would go for it anyway. Long term, code should definitely be ported over to the new std.process, and it's probably not all that big a deal in this case (though other modules in similar situations could be a much bigger deal - e.g. std.date -> std.datetime was a very large change), but the question is whether it's okay to force people to change their code immediately. If it's not (and given Walter's attitude and our current versioning scheme, I don't think that it is), then we need to come up with a new name for the new std.process module rather than replacing it in-place. But it _is_ this sort of change which would make it desirable to adjust our versioning scheme.
 Whether the old std.process sticks around for more than a year or two is
 therefore a separate matter from what we name the new std.process unless we
 add a feature like Andrei is suggesting.
Maybe I'm missing something but I don't see this feature solving anything yet.
It makes it so that you can have both the old and new std.process be named std.process. However, given that "v2.064+" or whatever would have to be there to use the new std.process and the fact that you'd presumably want to remove it once the old std.process was actually gone, it's ultimately not all that different from naming it std.process2 and then renaming it to std.process later. It's just an attribute rather than a 2 at the end of its name. - Jonathan M Davis
Feb 24 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/24/13 11:06 AM, Dmitry Olshansky wrote:
 24-Feb-2013 12:05, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
 Phobos modules which already use std.process would have to be changed
 to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process; or similar. By default code would import the old library.
The same could be achieved by simply using old version of compiler+druntime+phobos to compile old project.
That's quite different I'd say. Andrei
Feb 24 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Feb 24, 2013 at 10:05:01AM +0200, Andrei Alexandrescu wrote:
 On 2/24/13 6:26 AM, Andrej Mitrovic wrote:
Phobos modules which already use std.process would have to be changed
to directly import std.process1 or std.process2.
This is problematic as has been discussed. I think we could address immediate needs by attaching an attribute to import, e.g.: "v2.070+" import std.process;
Better yet, ">v2.070". So that later on it can be extended to ">v2.070 <v2.084", etc. But I don't know if it's a good idea to push it that far...
 or similar. By default code would import the old library.
[...] Alternatively, use a version identifier:

	version = newStdProcess;
	import std.process;	// get new version
	-----
	//version = newStdProcess;
	import std.process;	// get old version

Then once the old version has gone through the deprecation cycle and is kicked out, the new code can just ignore version=newStdProcess and always import the new version, and existing user code needs no changes. T -- Why ask rhetorical questions? -- JC
Feb 24 2013
next sibling parent reply "Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
On Sunday, 24 February 2013 at 19:45:26 UTC, H. S. Teoh wrote:
 Alternatively, use a version identifier:

 	version = newStdProcess;
 	import std.process;	// get new version
 	-----
 	//version = newStdProcess;
 	import std.process;	// get old version

 Then once the old version has gone through the deprecation 
 cycle and is
 kicked out, the new code can just ignore version=newStdProcess 
 and
 always import the new version, and existing user code needs no 
 changes.
Would work just fine, *if* versions propagated across modules, which they do not. (And doing so creates new problems discussed to death and beyond in times past.) But maybe you meant setting it at the command-line, which to be honest, was my first thought as well. The ensuing discussion baffles me.

How about this as a suggestion, even though I know it will never happen.

Release A:
- New process module is available as 'future.process'
- Old module remains available as 'std.process' but with a pragma(msg) warning users that it will go away next release. (It'd be even better to have a pragma(warn), actually.)
- Old module is duplicated as 'past.process'

Release B:
- New module is now 'std.process', but with a pragma(msg) reminding users to update code if they haven't already
- Old module remains at 'past.process'

Release C:
- New module remains at 'std.process' now with no special messages.
- Old module remains at 'past.process' for the last time.

Ta, fricking, da. We should have started a procedure like this ages ago for Phobos additions and rewrites. Not everyone is in a position to comfortably use pre-release compilers, so testing new code from git head is not an option for them. Further, this gives old code three whole releases before it's forced to update; that ought to be enough legacy support. Legacy support is the devil.
Feb 24 2013
parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/24/13, Chris Nicholson-Sauls <ibisbasenji gmail.com> wrote:
   - Old module remains available as 'std.process' but with a
 pragma(msg) warning users that it will go away next release.
 (It'd be even better to have a pragma(warn), actually.)
We have deprecated("message") for that. And it gives the user the option to silence the deprecation message via -d.
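
A minimal sketch of that, shown on a function rather than a whole module. The names oldSpawn and newSpawn are made up for illustration and are not part of std.process.

```d
import std.stdio : writeln;

// Hypothetical old entry point, kept only so existing callers still compile.
deprecated("oldSpawn will be removed; use newSpawn instead")
int oldSpawn(string cmd)
{
    return newSpawn(cmd);
}

// Hypothetical replacement; returns the command length as a dummy result.
int newSpawn(string cmd)
{
    return cast(int) cmd.length;
}

void main()
{
    // Not deprecated: compiles without any message.
    writeln(newSpawn("ls"));

    // Deprecated: the compiler reports the message above for this call,
    // unless deprecations are silenced with -d.
    writeln(oldSpawn("ls"));
}
```

The old code keeps building while every use of the deprecated symbol is flagged with a pointer to its replacement, which is what the pragma(msg) in the plan above was meant to achieve.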
Feb 24 2013
prev sibling parent Lee Braiden <leebraid gmail.com> writes:
On Sun, 24 Feb 2013 11:43:24 -0800, H. S. Teoh wrote:
  "v2.070+" import std.process;
 
 Better yet,  ">v2.070". So that later on it can be extended to  ">v2.070
 <v2.084", etc.. But I don't know if it's a good idea to push it that
 far...
 
Now that I think about it more, this is not an import-level issue. It's a package management/build-tool issue. If your package manager knows that you depend on phobos >= 2 && phobos <= 3, and can install them in a local build dir, when the global version doesn't match, then all of these problems go away. You also gain a lot from that: package managers know about inter-package dependencies; much more complex dependencies (even dependencies unknown to the author of a program at the time) can be handled automatically; there's no need to worry about whether std.process is version 1 or 2, or even if it's the one that originally came with phobos2, etc. -- Lee
Feb 24 2013
prev sibling parent Lee Braiden <leebraid gmail.com> writes:
On Sun, 24 Feb 2013 03:46:32 +0100, Andrej Mitrovic wrote:

 Ah but how can you guarantee we won't ever need a 3rd rewrite? It's
always possible we might need one in the future.
I think you want to guarantee that it CAN be re-written, because existing code always becomes crufty, and even less wise as time goes on, as new requirements become obvious.
 It's also a problem if we have to start remembering version numbers for
 each module we import. E.g.:
 
 import std.process2;
 import std.xml; // oops, did I mean xml2 maybe?
 import std.signals2;
 
 It's going to be annoying using Phobos like that. I was going to suggest
 using version flags, but even that could be annoying, although that
 feature was practically invented for this kind of problem.
Agreed. But I think the problem is that we're talking about changing individual modules' versions, within what should be a stable D API. I would much rather see something like D1 = Phobos1, D2 = Phobos2, and some flag at the top of a file, like d_version(2), to say which version your code expects/requires. The compiler could then say one of: * OK, I'm version 3, but I'll give you the v2 phobos --- probably through library code like "version(>= d3) { ... } else { ... }" * Sorry, I'm version 2, and you need version 3. You need a newer compiler. * Potentially, it could say, "Sorry, I'm D version 6, and only support versions since D4. To compile D1-D3 code, you need an older compiler." But I don't think this would be wise. Do D compilers currently warn about using deprecated features? If not, that would be a useful addition from this kind of version specification, too. The main benefit of this is that, within D2, there would be a truly STABLE API. You'd know whether you should go with Phobos's built-in stream library, say, or some third-party library, because there wouldn't be a new rewrite coming next week. Either you're writing for D2, or you're writing for D3. If D2 doesn't have it, then you find a third- party lib that does. If D3 has it, and you want that, then you check out the bleeding edge D3 compiler / libs, and hope it doesn't blow up. -- Lee
Feb 24 2013
prev sibling parent "Jakob Bornecrantz" <wallbraker gmail.com> writes:
On Saturday, 23 February 2013 at 22:48:04 UTC, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 03:15:26PM -0500, Steven Schveighoffer 
 wrote:
 On Sat, 23 Feb 2013 11:42:26 -0500, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:
 
- wait():
   - Some code examples would be nice.

   - For the POSIX-specific version, I thought the Posix
     standard specifies that the actual return code /
     signal number should be extracted by means of
     system-specific macros (in C anyway)?
     Wouldn't it be better to encapsulate this in a POD
     struct or something instead of exposing the
     implementation-specific values to the user?
We handle the extraction as an implementation detail, the result should be cross-platform (at least on signal-using platforms). I don't know what a POD struct would get you, maybe you could elaborate what you mean?
Oh, I thought the return value was just straight from the syscall, which requires WIFEXITED, WEXITSTATUS, WCOREDUMP, etc., to interpret. If it has already been suitably interpreted in std.process, then I guess it's OK. Otherwise, I was thinking of encapsulating these macros in some kind of POD struct, that provides methods like .ifExited, .exitStatus, .coreDump, etc. so that the user code doesn't have to directly play with the exact values returned by the specific OS.
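To make the POD-struct idea concrete, here is a minimal sketch, assuming a POSIX system and druntime's core.sys.posix.sys.wait bindings. ExitInfo and its member names are hypothetical; as stated in the reply above, std.process already does this decoding internally, so user code would only need something like this when calling waitpid() directly.

```d
version (Posix)
{
    import core.sys.posix.sys.wait : waitpid, WEXITSTATUS, WIFEXITED,
                                     WIFSIGNALED, WTERMSIG;
    import core.sys.posix.unistd : _exit, fork;
    import std.stdio : writefln;

    /// Hypothetical POD wrapper around the raw status word from waitpid().
    struct ExitInfo
    {
        int raw; // value exactly as written by waitpid()

        bool exited()     const { return WIFEXITED(raw); }   // normal exit?
        int  exitStatus() const { return WEXITSTATUS(raw); } // valid if exited
        bool signaled()   const { return WIFSIGNALED(raw); } // killed by signal?
        int  termSignal() const { return WTERMSIG(raw); }    // valid if signaled
    }

    void main()
    {
        auto child = fork();
        if (child == 0)
            _exit(3); // child: terminate immediately with exit code 3
        if (child < 0)
            return;   // fork failed; nothing to wait for

        int raw;
        waitpid(child, &raw, 0);
        auto info = ExitInfo(raw);
        if (info.exited)
            writefln("child exited with status %s", info.exitStatus);
        else if (info.signaled)
            writefln("child killed by signal %s", info.termSignal);
    }
}
else
{
    void main() {} // the sketch only applies to POSIX systems
}
```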
   - How do I wait for *any* child process to terminate, not
     just a specific Pid?
I don't think we have a method to do that. It would be complex, especially if posix wait() returned a pid that we are not handling! I suppose what you could do is call posix wait (I have a feeling we may need to change our global wait function, or eliminate it), and then map the result back to a Pid you are tracking. You have any ideas how this could be implemented? I'd prefer not to keep a global cache of child process objects...
Btw, on Windows a simple array of the hProcesses is all you need to do this. The code below uses an AA for reverse lookup; really simple stuff: https://github.com/Wallbraker/Unicorn/blob/master/src/uni/util/cmd.d#L354 The code was written before I knew of the new std.process, but it solved exactly this problem.
 Why not?  On Posix at least, you get SIGCHLD, etc., for all 
 child processes anyway, so a global cache doesn't seem to be 
 out-of-place.

 But you do have a point about pids that we aren't managing, 
 e.g. if the user code is doing some fork()s on its own. But
 the way I see it, std.process is supposed to alleviate the
 need to do such things directly, so in my mind, if everything
 is going through std.process anyway, might as well just manage
 all child processes there. OTOH, this may cause problems if the
 D program links in C/C++ libraries that manage their own child
 processes.
But as you state above, it only works for a single CmdGroup on posix and will probably interact badly with code using wait. I never got that far because the code works for my limited case. Supposedly process groups might help out, but they seem to interfere with signals. More reading is required. http://linux.die.net/man/7/credentials
 Still, it would be nice to have some way of waiting for a set 
 of child Pids, not just a single one. It would be a pain if
 user code had to manually manage child processes all the time
 when there's more than one of them running at a time.

 Hmm. The more I think about it, the more it makes sense to just 
 have std.process manage all child process related stuff. It's
 too painful to deal with multiple child processes otherwise.
 Maybe provide an opt-out in case you need to link in some
 C/C++ libraries that need their own child process handling, but
 the default, IMO, should be to manage everything through
 std.process.
This needs to be opt-in, because this stuff should not break libraries using fork/wait by default. Cheers, Jakob.
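One way to wait on a set of children without a global cache and without touching anyone else's processes is to poll only the Pids you created yourself. The sketch below assumes the tryWait() function from the std.process that ships in Phobos today, and a POSIX "sleep" binary; waitAny is a hypothetical helper, and polling is just one possible strategy, not what the module itself does.

```d
import core.thread : Thread;
import core.time : msecs;
import std.process : Pid, spawnProcess, tryWait;
import std.stdio : writefln;

/// Hypothetical helper: polls the given children until one of them has
/// terminated. Returns its index and stores its exit status in `status`.
/// Only Pids passed in are ever queried, so processes started by other
/// libraries via fork()/wait() are left alone.
size_t waitAny(Pid[] children, out int status)
{
    assert(children.length > 0, "nothing to wait for");
    for (;;)
    {
        foreach (i, pid; children)
        {
            auto result = tryWait(pid); // non-blocking check
            if (result.terminated)
            {
                status = result.status;
                return i;
            }
        }
        Thread.sleep(10.msecs); // avoid a busy loop
    }
}

void main()
{
    // Assumes a POSIX "sleep" program is on the PATH.
    auto children = [spawnProcess(["sleep", "2"]),
                     spawnProcess(["sleep", "1"])];
    while (children.length > 0)
    {
        int status;
        auto i = waitAny(children, status);
        writefln("pid %s finished with status %s",
                 children[i].processID, status);
        children = children[0 .. i] ~ children[i + 1 .. $];
    }
}
```

Because the caller owns the list of Pids, this stays opt-in by construction.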
Feb 23 2013
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 14:46:04 H. S. Teoh wrote:
 I never liked the design of errno in C... its being a global makes
 keeping track of errors a pain. It would be nice if the value of errno
 were saved in the Exception object at the time the error was
 encountered, instead of arbitrary amounts of code after, which may have
 changed its value.
That's what std.exception.errnoEnforce and ErrnoException do. The ErrnoException gets the error code when it's constructed. std.file.FileException does the same thing at least some of the time (depending on what caused the error). In the case of FileException, we should probably have subclasses for the main error codes that you might want to handle specially (so catching them explicitly would be useful), but we don't have that yet. I don't know what the new std.process is doing (I haven't looked at it yet), but if it's throwing exceptions based on errno, it needs to at least put the error code in the exception and maybe have specific exception types if it would make sense to be catching exceptions from std.process based on what exactly went wrong. Getting the value of errno after catching the exception is just asking for it, since who knows what code ran after the exception was originally thrown. - Jonathan M Davis
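A small sketch of that behaviour, using errnoEnforce and ErrnoException from std.exception together with the C fopen() binding. The helper name errnoOfFailedOpen and the file path are made up for illustration; the global errno is deliberately overwritten inside the catch block to stand in for unrelated code running after the throw.

```d
import core.stdc.errno : ENOENT, errno;
import core.stdc.stdio : fclose, fopen;
import std.exception : ErrnoException, errnoEnforce;
import std.stdio : writefln;

/// Hypothetical helper: tries to open `path` and returns the errno value
/// captured by the exception, or 0 if the open succeeded.
uint errnoOfFailedOpen(const(char)* path)
{
    try
    {
        // errnoEnforce throws ErrnoException if fopen returned null;
        // the exception records errno at construction time.
        auto f = errnoEnforce(fopen(path, "r"), "fopen failed");
        fclose(f);
        return 0;
    }
    catch (ErrnoException e)
    {
        errno = 0;      // simulate later code clobbering the global
        return e.errno; // the saved copy is unaffected
    }
}

void main()
{
    auto code = errnoOfFailedOpen("/nonexistent/definitely-missing");
    writefln("saved errno: %s (ENOENT is %s)", code, ENOENT);
}
```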
Feb 23 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 18:59:05 -0500, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 I don't know what the new std.process is doing (I haven't look at it  
 yet), but
 if it's throwing exceptions based on errno, it needs to at least put the  
 error
 code in the exception and maybe have specific exception types if it  
 would make
 sense to be catching exceptions from std.process based on what exactly  
 went
 wrong. Get the value of errno after catching the exception is just  
 asking for
 it, since who knows what code ran after the exception was originally  
 thrown.
It uses strerror to get the errno string representation, and uses that as the message. I think it should also save the error code. In a past life, I had a SystemException type (this was C++), and anything that died because of an errno error would throw a derivative of that, containing an errno copy. If we already have a base ErrnoException, ProcessException probably should derive from that. -Steve
Feb 23 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 19:21:26 Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 18:59:05 -0500, Jonathan M Davis <jmdavisProg gmx.com>
 
 wrote:
 I don't know what the new std.process is doing (I haven't look at it
 yet), but
 if it's throwing exceptions based on errno, it needs to at least put the
 error
 code in the exception and maybe have specific exception types if it
 would make
 sense to be catching exceptions from std.process based on what exactly
 went
 wrong. Get the value of errno after catching the exception is just
 asking for
 it, since who knows what code ran after the exception was originally
 thrown.
It uses strerror to get the errno string representation, and uses that as the message. I think it should also save the error code. In a past life, I had a SystemException type (this was C++), and anything that died because of an errno error would throw a derivative of that, containing an errno copy. If we already have a base ErrnoException, ProcessException probably should derive from that.
Possibly, but you have to be a bit careful with that. For instance, std.file.FileException is in a weird place, because sometimes a FileException comes from errno and sometimes not, which means that if it were made to derive from ErrnoException, it could be a problem. ProcessException may not have that problem, but that sort of issue should be considered when deciding its inheritance hierarchy. If we had multiple inheritance, it would be an easier decision, but we don't. But at a minimum, there should be a way to access the error code from errno if that's where an exception originated from (even if it's just that it's an exception that has ErrnoException as its next). - Jonathan M Davis
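The "ErrnoException as its next" option could look roughly like the sketch below. SpawnException, failingSpawn and underlyingErrno are hypothetical names used only for illustration (this is not how ProcessException is actually defined), and the EAGAIN failure is faked rather than produced by a real system call.

```d
import core.stdc.errno : EAGAIN;
import std.exception : ErrnoException;
import std.stdio : writefln;

/// Hypothetical process-level exception that does NOT derive from
/// ErrnoException, but can carry one as the next link in its chain.
class SpawnException : Exception
{
    this(string msg, Throwable next = null,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, next, file, line);
    }
}

/// Pretends that a system call failed with EAGAIN and wraps the
/// low-level error in the higher-level exception.
void failingSpawn()
{
    try
    {
        throw new ErrnoException("fork failed", EAGAIN);
    }
    catch (ErrnoException e)
    {
        throw new SpawnException("could not start process", e);
    }
}

/// Walks the exception chain and returns the first errno found, or 0.
uint underlyingErrno(Throwable e)
{
    for (auto t = e; t !is null; t = t.next)
        if (auto ee = cast(ErrnoException) t)
            return ee.errno;
    return 0;
}

void main()
{
    try
        failingSpawn();
    catch (SpawnException e)
        writefln("%s (errno %s)", e.msg, underlyingErrno(e));
}
```

An exception with no system-level cause simply has no ErrnoException in its chain, which sidesteps the FileException-style ambiguity described above.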
Feb 23 2013
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
I apologise if this gets posted twice, but my newsgroup client is 
acting up.

On Saturday, 23 February 2013 at 16:44:30 UTC, H. S. Teoh wrote:
 I just looked over the docs. Looks very good!
Thanks! :)
 Just a few minor comments:

 - wait():
    - Some code examples would be nice.
The wait() documentation says "Examples: See the spawnProcess documentation", and provides a link. I didn't see any point in duplicating the examples there just for the sake of it.
    - For the POSIX-specific version, I thought the Posix 
 standard
      specifies that the actual return code / signal number 
 should be
      extracted by means of system-specific macros (in C anyway)?
      Wouldn't it be better to encapsulate this in a POD struct 
 or
      something instead of exposing the implementation-specific 
 values to
      the user?
Steve already answered this, but yes, the number returned by wait() has already been processed by these macros. wait() therefore has the same result on all POSIX systems.
    - How do I wait for *any* child process to terminate, not 
 just a
      specific Pid?
I will write a separate post about this shortly.
 - execute() and shell(): I'm a bit concerned about returning the
   *entire* output of a process as a string. What if the output 
 generates
   too much output to store in a string? Would it be better to 
 return a
   range instead (either a range of chars or range of lines 
 maybe)? Or is
   this what pipeProcess was intended for? In any case, would it 
 make
   sense to specify some kind of upper limit to the size of the 
 output so
   that the program won't be vulnerable to bad subprocess 
 behaviour
   (generate infinite output, etc.)?
Again, Steve answered this, but let me clarify: There are three layers of functionality in this module. 1. spawnProcess() gives you detailed control over process creation, in a cross-platform manner. 2. pipeProcess()/pipeShell() are convenience functions that take care of creating pipes to the child process for you, as this is a common use case. Use this if you need detailed control over what goes in and out of the process. 3. execute()/shell() are a second layer of convenience, if you just want a simple way to execute a process and retrieve its output. If there is any chance of the process producing vast amounts of output, you really should use pipeProcess() or even spawnProcess(). Of course, it is a simple matter to add an optional maxOutputSize parameter to execute() and an "overflow" flag to the returned tuple, but I think it adds more API clutter than it is worth. I'd love to hear others' opinions about it, though.
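The three layers might be used roughly as follows. This is a sketch against the std.process that ships in Phobos today (function names could differ from the version under review here), and it assumes a POSIX "echo" program on the PATH; countLines is a hypothetical helper showing how the pipe layer lets you consume output incrementally instead of holding it all in one string.

```d
import std.process : execute, pipeProcess, Redirect, spawnProcess, wait;
import std.stdio : writeln;

/// Hypothetical helper (layer 2): streams the child's stdout line by line,
/// so memory use stays bounded no matter how much the child prints.
size_t countLines(string[] args)
{
    auto pipes = pipeProcess(args, Redirect.stdout);
    size_t lines;
    foreach (line; pipes.stdout.byLine)
        ++lines;
    wait(pipes.pid); // reap the child once its output is exhausted
    return lines;
}

void main()
{
    // Layer 1: full control; the child inherits our stdin/stdout/stderr.
    auto pid = spawnProcess(["echo", "spawned"]);
    writeln("exit code: ", wait(pid));

    // Layer 2: pipes are set up for us; we decide how to read them.
    writeln("lines seen: ", countLines(["echo", "one\ntwo"]));

    // Layer 3: run to completion and collect everything in memory.
    auto result = execute(["echo", "captured"]);
    writeln(result.status, ": ", result.output);
}
```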
 - ProcessException: are there any specific methods to help user 
 code
   extract information about the error? Or is the user expected 
 to check
   errno himself (on Posix; or whatever it is on Windows)?
Well, no. The thing is, ProcessException may be associated with an errno code, with a Windows GetLastError() code, or it may simply be a pure "D error" with no relation at all to the underlying C APIs. Personally, I believe that the distinction between ProcessException (for process management errors) and StdioException (for I/O errors, mostly related to pipes) provides enough granularity for most use cases. I believe it's the best we can do in a cross-platform manner without inventing our own error codes and mapping them to errno/GetLastError() codes. If anyone needs the underlying OS error code, they can still retrieve them using the system-specific APIs. Again, I am of course open for discussion about this. Lars
Feb 24 2013
prev sibling next sibling parent reply "Jakob Bornecrantz" <wallbraker gmail.com> writes:
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use over 
 those years, both by yours truly and others, so hopefully the 
 worst wrinkles are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  
 (The wiki page indicates that both std.benchmark and std.uni 
 are currently being reviewed, but I fail to find any "official" 
 review threads on the forum.  Is the wiki just out of date?)
Do these include the changes that Alex/Zor did on it? I submitted a "bug" report that he fixed. Cheers, Jakob.
Feb 23 2013
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 24-02-2013 01:27, Jakob Bornecrantz wrote:
 On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :) The
 upshot is that the module has actually seen active use over those
 years, both by yours truly and others, so hopefully the worst wrinkles
 are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html


 I hope we can get it reviewed in time for the next release. (The wiki
 page indicates that both std.benchmark and std.uni are currently being
 reviewed, but I fail to find any "official" review threads on the
 forum.  Is the wiki just out of date?)
Do these include the changes that Alex/Zor did on it? I submitted a "bug" report that he fixed. Cheers, Jakob.
Yes, Lars pulled in ~all of my changes. -- Alex Rønne Petersen alex alexrp.com / alex lycus.org http://lycus.org
Feb 23 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 18:39:10 H. S. Teoh wrote:
 Alternatively, I would push for renaming the old std.process to
 something like old.process (or something else), which is much less of a
 breakage than deleting it from Phobos outright -- existing code just
 need to have their imports fixed and will continue working, whereas
 deleting the module outright leaves existing code with no recourse but
 to potentially rewrite from scratch. This may be easier to convince
 Walter & Andrei on, than outright killing old deprecated modules.
Possibly, but Walter takes a very dim view on most any code breakage, even if it means simply changing a makefile to make your code work again, so I'd be very surprised if he thought that moving the current std.process would be acceptable. If Andrei could be convinced, then we could probably do it, but I wouldn't expect him to agree, and IIRC, he had no problem with the std.process2 scheme and might even have suggested it. So, I suspect that your only hope of avoiding std.process2 is if you can come up with a better name. - Jonathan M Davis
Feb 23 2013
parent reply "Nathan M. Swan" <nathanmswan gmail.com> writes:
Jonathan M Davis wrote:
 On Saturday, February 23, 2013 18:39:10 H. S. Teoh wrote:
 Alternatively, I would push for renaming the old std.process to
 something like old.process (or something else), which is much less of a
 breakage than deleting it from Phobos outright -- existing code just
 need to have their imports fixed and will continue working, whereas
 deleting the module outright leaves existing code with no recourse but
 to potentially rewrite from scratch. This may be easier to convince
 Walter & Andrei on, than outright killing old deprecated modules.
Possibly, but Walter takes a very dim view on most any code breakage, even if it means simply changing a makefile to make your code work again, so I'd be very surprised if he thought that moving the current std.process would be acceptable. If Andrei could be convinced, then we could probably do it, but I wouldn't expect him to agree, and IIRC, he had no problem with the std.process2 scheme and might even have suggested it. So, I suspect that your only hope of avoiding std.process2 is if you can come up with a better name. - Jonathan M Davis
Why not just deprecate everything currently in std.process and drop in the new stuff? It might be a bit ugly, but it prevents both code breakage _and_ a proliferation of "std.module2"s. My 2 cents, NMS
Feb 23 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 19:47:54 Nathan M. Swan wrote:
 Jonathan M Davis wrote:
 On Saturday, February 23, 2013 18:39:10 H. S. Teoh wrote:
 Alternatively, I would push for renaming the old std.process to
 something like old.process (or something else), which is much less of a
 breakage than deleting it from Phobos outright -- existing code just
 need to have their imports fixed and will continue working, whereas
 deleting the module outright leaves existing code with no recourse but
 to potentially rewrite from scratch. This may be easier to convince
 Walter & Andrei on, than outright killing old deprecated modules.
Possibly, but Walter takes a very dim view on most any code breakage, even if it means simply changing a makefile to make your code work again, so I'd be very surprised if he thought that moving the current std.process would be acceptable. If Andrei could be convinced, then we could probably do it, but I wouldn't expect him to agree, and IIRC, he had no problem with the std.process2 scheme and might even have suggested it. So, I suspect that your only hope of avoiding std.process2 is if you can come up with a better name. - Jonathan M Davis
Why not just deprecate everything currently in std.process and drop in the new stuff? It might be a bit ugly, but it prevents both code breakage _and_ a proliferation of "std.module2"s.
That only works if there are no conflicts and none of the functions' behaviors are changed in a manner which would be incompatible with how they currently work. We _were_ able to do exactly what you're suggesting with std.path, but it isn't always possible. std.random would be a prime case where it would be a problem, because one of the main changes that we want to make is to translate all of its structs into classes, which would break a lot of code, and wouldn't be compatible at all - not unless you come up with new names for everything, which would be downright ugly. I don't know how much the new std.process conflicts with the old one, but since the main fellow behind the new one is the same one who did the new std.path, I suspect that they conflict too much to be able to do the transition within a single module. I'd have to compare them to be sure though. - Jonathan M Davis
Feb 23 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 04:54:13 UTC, Jonathan M Davis 
wrote:
 On Saturday, February 23, 2013 19:47:54 Nathan M. Swan wrote:
 Why not just deprecate everything currently in std.process and 
 drop in
 the new stuff? It might be a bit ugly, but it prevents both 
 code
 breakage _and_ a proliferation of "std.module2"s.
That only works if there are no conflicts and none of the functions' behaviors are changed in a manner which would be incompatible with how they currently work. [...] I don't know how much the new std.process conflicts with the old one, but since the main fellow behind the new one is the same one who did the new std.path, I suspect that they conflict too much to be able to do the transition within a single module. I'd have to compare them to be sure though.
Let me save you the trouble. There are three points of incompatibility between the old and the new APIs that prevent them from coexisting:

- shell(): The old shell() captures the standard output, returns only a string containing the output, and throws an exception if the program returns with a nonzero exit code. The new shell() only throws if there was an error with actually running the process, it captures both the standard output AND standard error streams, and it returns a tuple which contains both the output and the exit code.

- environment.opIndex(): The new version no longer throws if the environment variable does not exist, it simply returns null. This is a very subtle change that will not cause a compilation error. It was I who wrote the old 'environment' as well, and the idea was for it to have the exact same interface as the built-in associative arrays. Therefore, at the time, it seemed like a good idea to throw from opIndex(). However, having actually used it for quite some time, I have found that I *never* want that functionality. I *always* end up calling environment.get() instead, to avoid the exception. Thus, I decided to use this opportunity to change it.

- environment.get(): To distinguish it from opIndex(), the second argument (the default value if the variable doesn't exist) is no longer optional. This, however, will cause a compilation error in the cases where the second argument has been left out, and all other cases will work like they used to.

I would absolutely *hate* to have to change the name of shell(), or, worse, revert it to the old version. I am reluctant to revert environment.opIndex() too, as I would really like to get it right this time. I am not going to be as adamant on this, however. environment.get() is not a big deal, but if environment.opIndex() changes, this might as well do too. Lars
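For reference, the difference between the two lookup styles can be sketched as below. The sketch assumes the behaviour of std.process.environment as it exists in Phobos today, where opIndex throws for a missing variable (i.e. the "old" behaviour described above); the variable names and the lookup helper are made up for the example.

```d
import std.process : environment;
import std.stdio : writeln;

/// Hypothetical helper: lookup with an explicit fallback, never throws
/// for a missing variable.
string lookup(string name, string fallback)
{
    return environment.get(name, fallback);
}

void main()
{
    environment["DEMO_SET"] = "value";  // define a variable
    environment.remove("DEMO_UNSET");   // make sure this one is absent

    writeln(lookup("DEMO_SET", "fallback"));
    writeln(lookup("DEMO_UNSET", "fallback"));

    try
    {
        // AA-style indexing: throws when the variable does not exist.
        auto v = environment["DEMO_UNSET"];
        writeln(v);
    }
    catch (Exception e)
    {
        writeln("opIndex threw: ", e.msg);
    }
}
```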
Feb 24 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 24 February 2013 at 14:43:50 UTC, Lars T. Kyllingstad 
wrote:
 [snip]
Hi Lars,

First of all, about environment. I think the old behavior makes more sense. I think you had a good point about making it behave like an associative array. I would expect using opIndex with a nonexistent key to throw. Subtle deviations of behavior for types that generally behave like well-known types can introduce latent bugs.

The danger is even more potent in the case of environment variables, as those are often used for constructing command-lines and such. If attempting to get the value of a nonexistent variable now returns null, which is used to build a command line, unexpected things can happen. For example, let's say that you're writing a program for analyzing malware, which expects $BINDUMP to be set to the path of some analysis tool. So it runs environment["BINDUMP"] ~ args[1] - however, if BINDUMP is unset, the program runs the malware executable itself. For another example, here's this classic catastrophic bug in shell scripts: rm -rf $FOO/$BAR What happens if $FOO and $BAR are unset?

One thing that I think is missing from the environment object is opIn_r. Implementing opIn_r would allow users to more safely explicitly check if a variable is set or not, and is more readable than environment.get("FOO", null).

And of course, there's the issue of people migrating code from the old module version to the new one: if they relied on the old behavior, the code can break in unexpected ways after the migration. What are your specific reasons for changing environment's behavior?

Speaking of shells, I noticed you hardcode cmd.exe in std.process2. That's another bug, it should look at the COMSPEC variable.

Also, about the shell function, I noticed this recent bug report: http://d.puremagic.com/issues/show_bug.cgi?id=9444 Maybe it somehow makes the transition to the new function easier? :) If not, since you're adamant about not changing the name, can we overload the function (e.g. make the new one return some results in "out" parameters), and deprecate the original overload?

Finally, I'd just like to sum up that we seem to have two decisions on the scales: somehow solving the API incompatibilities, or introducing the new version as an entirely new module. The latter is a mess we really don't want to get into, so it'd need to be justified, and IMHO the incompatibilities don't seem to be as severe and unresolvable to warrant that mess.
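The BINDUMP hazard is easy to reproduce: in D, concatenating a null string with another string silently yields the other string. The sketch below shows that, plus an explicit existence check. It assumes the `in` operator that std.process.environment provides in Phobos today (the feature requested above as opIn_r); toolCommand and the variable name BINDUMP_DEMO are hypothetical.

```d
import std.exception : enforce;
import std.process : environment;
import std.stdio : writeln;

/// Hypothetical helper: builds [tool, target], refusing to continue
/// when the variable naming the tool is not set.
string[] toolCommand(string var, string target)
{
    enforce(var in environment, var ~ " is not set");
    return [environment[var], target];
}

void main()
{
    environment.remove("BINDUMP_DEMO");

    // With a null-returning lookup the missing tool just disappears,
    // and the "command" collapses to the target itself.
    string tool = environment.get("BINDUMP_DEMO", null);
    writeln(tool ~ "sample.exe");

    try
        writeln(toolCommand("BINDUMP_DEMO", "sample.exe"));
    catch (Exception e)
        writeln("refused: ", e.msg);
}
```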
Feb 24 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 00:15:21 UTC, Vladimir Panteleev 
wrote:
 On Sunday, 24 February 2013 at 14:43:50 UTC, Lars T. 
 Kyllingstad wrote:
 [snip]
Hi Lars, First of all, about environment. I think the old behavior makes more sense. I think you had a good point about making it behave like an associative array. I would expect using opIndex with an inexisting key to throw. Subtle deviations of behavior for types that generally behave like well-known types can introduce latent bugs. The danger is even more potent in the case of environment variables, as those are often used for constructing command-lines and such. If attempting to get the value of an inexisting variable now returns null, which is used to build a command line, unexpected things can happen. For example, let's say that you're writing a program for analyzing malware, which expects $BINDUMP to be set to the path of some analysis tool. So it runs environment["BINDUMP"] ~ args[1] - however, if BINDUMP is unset, the program runs the malware executable itself. For another example, here's this classic catastrophic bug in shell scripts: rm -rf $FOO/$BAR What happens if $FOO and $BAR are unset? One thing that I think is missing from the environment object is opIn_r. Implementing opIn_r would allow users to more safely explicitly check if a variable is set or not, and is more readable than environment.get("FOO", null). And of course, there's the issue of people migrating code from the old module version to the new one: if they relied on the old behavior, the code can break in unexpected ways after the migration. What are your specific reasons for changing environment's behavior? Speaking of shells, I noticed you hardcode cmd.exe in std.process2. That's another bug, it should look at the COMSPEC variable. Also, about the shell function, I noticed this recent bug report: http://d.puremagic.com/issues/show_bug.cgi?id=9444 Maybe it somehow makes the transition to the new function easier? :) If not, since you're adamant about not changing the name, can we overload the function (e.g. 
make the new one return some results in "out" parameters), and deprecate the original overload? Finally, I'd just like to sum up that we seem to have two decisions on the scales: somehow solving the API incompatibilities, or introducing the new version as an entirely new module. The latter is a mess we really don't want to get into, so it'd need to be justified, and IMHO the incompatibilities don't seem to be as severe and unresolvable to warrant that mess.
Sorry, I have to get to work now, and I don't have time to answer your post properly. I will say this, though: You make a strong case about environment.opIndex(). :) I'll think some more about it and write a proper reply later. Lars
Feb 24 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 00:15:21 UTC, Vladimir Panteleev 
wrote:
 On Sunday, 24 February 2013 at 14:43:50 UTC, Lars T. 
 Kyllingstad wrote:
 [snip]
Hi Lars, First of all, about environment. I think the old behavior makes more sense. I think you had a good point about making it behave like an associative array. I would expect using opIndex with an inexisting key to throw. Subtle deviations of behavior for types that generally behave like well-known types can introduce latent bugs. The danger is even more potent in the case of environment variables, as those are often used for constructing command-lines and such. If attempting to get the value of an inexisting variable now returns null, which is used to build a command line, unexpected things can happen. For example, let's say that you're writing a program for analyzing malware, which expects $BINDUMP to be set to the path of some analysis tool. So it runs environment["BINDUMP"] ~ args[1] - however, if BINDUMP is unset, the program runs the malware executable itself. For another example, here's this classic catastrophic bug in shell scripts: rm -rf $FOO/$BAR What happens if $FOO and $BAR are unset? One thing that I think is missing from the environment object is opIn_r. Implementing opIn_r would allow users to more safely explicitly check if a variable is set or not, and is more readable than environment.get("FOO", null). And of course, there's the issue of people migrating code from the old module version to the new one: if they relied on the old behavior, the code can break in unexpected ways after the migration. What are your specific reasons for changing environment's behavior?
My reasons were what I said in my other post: In the time I have been using the 'environment' API -- that is, for 2 1/2 years (I checked) -- I don't think there is a *single* time when I've chosen environment[var] over environment.get(var, null). The thing about the process environment, as opposed to an associative array inside your own program, is that you can never be certain which variables are defined and which aren't. This means that you will almost *always* have to check whether a variable exists before using it, thus rendering opIndex() pretty much useless for most cases. Furthermore, I really don't think it is too much to expect that a user of a systems language such as D checks the return values of functions that may return a 'null' value. However, I also think that quick'n dirty scripting is an extremely compelling use case for D, and in that case, your point is well taken. (I also get your arguments about backwards compatibility and not deviating from the AA interface, but that was what did it for me.) I am now on the fence about this.
 Speaking of shells, I noticed you hardcode cmd.exe in 
 std.process2. That's another bug, it should look at the COMSPEC 
 variable.
Thanks, I didn't know that. On POSIX, the -c switch is standard, and works on most, if not all, shells. Can we assume that /C is equally standardised on Windows shells?
 Also, about the shell function, I noticed this recent bug 
 report:
 http://d.puremagic.com/issues/show_bug.cgi?id=9444

 Maybe it somehow makes the transition to the new function 
 easier? :)
Hehe. :)
 If not, since you're adamant about not changing the name, can 
 we overload the function (e.g. make the new one return some 
 results in "out" parameters), and deprecate the original 
 overload?

 Finally, I'd just like to sum up that we seem to have two 
 decisions on the scales: somehow solving the API 
 incompatibilities, or introducing the new version as an 
 entirely new module. The latter is a mess we really don't want 
 to get into, so it'd need to be justified, and IMHO the 
 incompatibilities don't seem to be as severe and unresolvable 
 to warrant that mess.
Let us see where the discussion about command quoting ends up. It is going to have an impact on most of the API, shell() included. Lars
Feb 25 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 19:28:33 UTC, Lars T. Kyllingstad 
wrote:
 My reasons were what I said in my other post:  In the time I 
 have been using the 'environment' API -- that is, for 2 1/2 
 years (I checked) -- I don't think there is a *single* time 
 when I've chosen environment[var] over environment.get(var, 
 null).

 The thing about the process environment, as opposed to an 
 associative array inside your own program, is that you can 
 never be certain which variables are defined and which aren't.
Indeed. If a program expects a variable to be set when it isn't, the program should fail. Throwing in opIndex is one way to implement that.
 This means that you will almost *always* have to check whether 
 a variable exists before using it, thus rendering opIndex() 
 pretty much useless for most cases.
Check... and do what? Print a nicer error message?
 Furthermore, I really don't think it is too much to expect that 
 a user of a systems language such as D checks the return values 
 of functions that may return a 'null' value.
Expecting any sorts of things from the library user is not the way to go. It is the same reason why returning integer error codes has gone by way of history in favor of exception handling: checking all return values is cumbersome, it requires writing more code, many programmers don't do it, and the result is bad programs. The simplest code should also be correct, this is one of D's principles.
 Speaking of shells, I noticed you hardcode cmd.exe in 
 std.process2. That's another bug, it should look at the 
 COMSPEC variable.
Thanks, I didn't know that. On POSIX, the -c switch is standard, and works on most, if not all, shells. Can we assume that /C is equally standardised on Windows shells?
I believe so.
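A minimal sketch of the COMSPEC suggestion, assuming that /C and -c are accepted by whatever shell the variables name. The helper shellPrefix and the fallback values "cmd.exe" and "/bin/sh" are illustrative choices, not what std.process2 does.

```d
import std.process : environment;

/// Hypothetical helper: returns the argv prefix used to run a command string
/// through the user's shell, consulting the environment before falling back.
string[] shellPrefix()
{
    version (Windows)
        return [environment.get("COMSPEC", "cmd.exe"), "/C"];
    else
        return [environment.get("SHELL", "/bin/sh"), "-c"];
}

void main()
{
    import std.stdio : writeln;
    writeln(shellPrefix() ~ "echo hello");
}
```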
Feb 25 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 19:38:59 UTC, Vladimir Panteleev 
wrote:
 On Monday, 25 February 2013 at 19:28:33 UTC, Lars T. 
 Kyllingstad wrote:
 This means that you will almost *always* have to check whether 
 a variable exists before using it, thus rendering opIndex() 
 pretty much useless for most cases.
Check... and do what? Print a nicer error message?
That would depend on the application.
 Furthermore, I really don't think it is too much to expect 
 that a user of a systems language such as D checks the return 
 values of functions that may return a 'null' value.
Expecting any sorts of things from the library user is not the way to go. It is the same reason why returning integer error codes has gone by way of history in favor of exception handling: checking all return values is cumbersome, it requires writing more code, many programmers don't do it, and the result is bad programs. The simplest code should also be correct, this is one of D's principles.
Exceptions are designed to handle exceptional cases. A missing environment variable isn't exceptional, it is commonplace. Lars
Feb 25 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Feb 2013 15:09:14 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:


 Exceptions are designed to handle exceptional cases.  A missing  
 environment variable isn't exceptional, it is commonplace.
+1 -Steve
Feb 25 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. Kyllingstad 
wrote:
 Exceptions are designed to handle exceptional cases.  A missing 
 environment variable isn't exceptional, it is commonplace.
I disagree. I don't know your use cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.
 That would depend on the application.
Could you provide a specific example? It's difficult to discuss the merits of either approach without some use cases. As I see it, there are two major cases: 1) The program expects a variable to be set. An example of this is COMSPEC / SHELL. These variables ought to be set on any system, so the user is not expected to verify this himself. The variables not being set is an exceptional situation. 2) A variable may or may not be set, such as the case of passing additional options via the environment (such as INCLUDE, or LD_PRELOAD). The program will take specific action if the variable is not set, such as pretending it is empty, or defaulting to some other setting like one in a configuration file. It seems like your approach caters to the second situation exclusively. I've mentioned the problems of applying this approach to the first situation in my previous post.
Feb 25 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, February 25, 2013 21:21:53 Vladimir Panteleev wrote:
 On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. Kyllingstad
 wrote:
 Exceptions are designed to handle exceptional cases. A missing
 environment variable isn't exceptional, it is commonplace.
I disagree. I don't know your use cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.
Agreed. I would think that it would make sense for opIndex to throw, and get to return null. Unlike AAs, opIndex would throw an exception rather than an error, but I think that the situation is very similar to that of trying to open a file. - Jonathan M Davis
Feb 25 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 20:21:55 UTC, Vladimir Panteleev
wrote:
 On Monday, 25 February 2013 at 20:09:14 UTC, Lars T. 
 Kyllingstad wrote:
 Exceptions are designed to handle exceptional cases.  A 
 missing environment variable isn't exceptional, it is 
 commonplace.
I disagree. I don't know your use cases, but as far as I can see, if the program expects the variable to be present in the environment, then it is no different from a missing file which the program expects to be present, or malformed user input.
 That would depend on the application.
Could you provide a specific example? It's difficult to discuss the merits of either approach without some use cases.
Well, take Phobos, for instance. Besides std.process, there are two places where environment/getenv is used: std.file.tempDir() (POSIX version) and std.path.expandTilde(). None of them throw if the variables in question don't exist, they both take some default action instead.
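The "take some default action" pattern mentioned here might look like the sketch below. It is not the actual std.file.tempDir() implementation; the helper name pickTempDir, the list of variables, and the "/tmp" fallback are assumptions made for illustration.

```d
import std.process : environment;

/// Hypothetical helper: tries a few conventional variables in order and
/// falls back to a fixed default if none of them is set to a usable value.
string pickTempDir()
{
    foreach (name; ["TMPDIR", "TEMP", "TMP"])
    {
        string dir = environment.get(name, null);
        if (dir.length > 0)   // skips both unset (null) and empty values
            return dir;
    }
    return "/tmp";
}

void main()
{
    import std.stdio : writeln;
    writeln(pickTempDir());
}
```

Code written this way never needs opIndex at all, which is the usage pattern Lars describes.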
 As I see, there are two major cases:

 1) The program expects a variable to be set. An example of this 
 is COMSPEC / SHELL. These variables ought to be set on any 
 system, so the user is not expected to verify this himself. The 
 variables not being set is an exceptional situation.

 2) A variable may or may not be set, such as the case of 
 passing additional options via the environment (such as 
 INCLUDE, or LD_PRELOAD). The program will take specific action 
 if the variable is not set, such as pretending it is empty, or 
 defaulting to some other setting like one in a configuration 
 file.

 It seems like your approach caters to the second situation 
 exclusively. I've mentioned the problems of applying this 
 approach to the first situation in my previous post.
What if the variable is set, but empty? Is that very different from the situation where it doesn't exist at all? In my opinion, when it comes to environment variables, no. You mention 'rm -rf $FOO/$BAR' as a "classic catastrophic bug" in shell scripts. This is just as much a problem if FOO and BAR are simply empty, and throwing from opIndex() won't help you with that. You still have to test for empty(), which would *also* test for null. Lars
Feb 25 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, February 26, 2013 08:08:33 Lars T. Kyllingstad wrote:
 What if the variable is set, but empty?  Is that very different
 from the situation where it doesn't exist at all?  In my opinion,
 when it comes to environment variables, no.
And yet, there _is_ a difference. I've dealt with code before that simply cared about whether an environment variable was set and not at all what it was set to. Regardless of whether that's desirable behavior, any program that needs to be compatible with a program that follows that behavior will need to be able to follow that behavior as well. So, if std.process is set up so that you can't tell the difference between an environment variable which hasn't been set and one that's been set to nothing, then that's a problem, even if it's not the most common case. - Jonathan M Davis
Feb 25 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 07:16:34 UTC, Jonathan M Davis 
wrote:
 On Tuesday, February 26, 2013 08:08:33 Lars T. Kyllingstad 
 wrote:
 What if the variable is set, but empty?  Is that very different
 from the situation where it doesn't exist at all?  In my 
 opinion,
 when it comes to environment variables, no.
And yet, there _is_ a difference. I've dealt with code before that simply cared about whether an environment variable was set and not at all what it was set to. Regardless of whether that's desirable behavior, any program that needs to be compatible with a program that follows that behavior will need to be able to follow that behavior as well. So, if std.process is set up so that you can't tell the difference between an environment variable which hasn't been set and one that's been set to nothing, then that's a problem, even if it's not the most common case.
You can tell the difference. In the former case environment.opIndex() will return null, in the latter it will return "". Use 'is null' to determine which. Lars
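The 'is null' distinction relies on how D treats null and empty strings. The snippet below shows only the language behavior, using two local variables as stand-ins for an unset and an empty variable; whether a given environment implementation actually returns "" for an empty variable is a separate question.

```d
void main()
{
    string unset = null;   // stands in for "variable does not exist"
    string blank = "";     // stands in for "variable exists but is empty"

    assert(unset is null);
    assert(blank !is null);        // a "" literal has a non-null pointer
    assert(unset == blank);        // == compares contents, so the two look equal
    assert(unset.length == 0 && blank.length == 0);
}
```

So a check written with == or .length cannot tell the two cases apart; only 'is null' can.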
Feb 25 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/26/13 2:16 AM, Jonathan M Davis wrote:
 On Tuesday, February 26, 2013 08:08:33 Lars T. Kyllingstad wrote:
 What if the variable is set, but empty?  Is that very different
 from the situation where it doesn't exist at all?  In my opinion,
 when it comes to environment variables, no.
And yet, there _is_ a difference. I've dealt with code before that simply cared about whether an environment variable was set and not at all what it was set to. Regardless of whether that's desirable behavior, any program that needs to be compatible with a program that follows that behavior will need to be able to follow that behavior as well. So, if std.process is set up so that you can't tell the difference between an environment variable which hasn't been set and one that's been set to nothing, then that's a problem, even if it's not the most common case. - Jonathan M Davis
Guess we could go with returning a null string if nonexistent and "" if it exists but is empty. This makes the implementation space about as subtle as the problem space. Andrei
Feb 26 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 07:08:37 UTC, Lars T. Kyllingstad 
wrote:
 What if the variable is set, but empty?  Is that very different
 from the situation where it doesn't exist at all?  In my 
 opinion,
 when it comes to environment variables, no.
Until today, I didn't know that empty variables could exist. They don't exist on Windows: setting a variable to an empty string is how you delete it. Regardless, I think my point still stands on the argument that it's much more likely for a variable to be unexpectedly unset rather than unexpectedly empty. To extend to a general case, we could say that an empty variable is as likely as any invalid or unexpected value. For the 'rm -rf $FOO/$BAR' case, one can come up with any combinations of FOO and BAR, such as "/bin" and "../", where the command would have the same effect.
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 07:28:11 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 07:08:37 UTC, Lars T. Kyllingstad wrote:
 What if the variable is set, but empty?  Is that very different
 from the situation where it doesn't exist at all?  In my opinion,
 when it comes to environment variables, no.
Until today, I didn't know that empty variables could exist. They don't exist on Windows: setting a variable to an empty string is how you delete it. Regardless, I think my point still stands on the argument that it's much more likely for a variable to be unexpectedly unset rather than unexpectedly empty. To extend to a general case, we could say that an empty variable is as likely as any invalid or unexpected value. For the 'rm -rf $FOO/$BAR' case, one can come up with any combinations of FOO and BAR, such as "/bin" and "../", where the command would have the same effect.
If I use $XYZ in a script, and XYZ isn't set, it equates to nothing. When I use getenv, it returns null. That is the behavior I would intuitively expect. On one hand, I think the correct behavior is to return null, and let the program deal with checking the error, or use get if they have a default. If we throw an exception, people will end up catching the exception in order to avoid an unintended error. Exceptions are not good for flow control, they are for exceptional situations that you didn't plan for. On the other hand, the other implementation, which is already in std.process, is also a valid implementation, and there already exists code which uses it. If we have the same abilities via get, then no functionality is lost, it's just a tad more verbose. In my opinion, the current implementation is only slightly worse than the proposed. If there is a chance we can simply replace std.process instead of using std.process2 if we go with the original implementation, I think we should do that. -Steve
Feb 26 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 14:02:08 UTC, Steven 
Schveighoffer wrote:
 If I use $XYZ in a script, and XYZ isn't set, it equates to 
 nothing.  When I use getenv, it returns null.  That is the 
 behavior I would intuitively expect.
I thought well-written scripts should use "set -u"?
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 09:15:05 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:02:08 UTC, Steven Schveighoffer wrote:
 If I use $XYZ in a script, and XYZ isn't set, it equates to nothing.   
 When I use getenv, it returns null.  That is the behavior I would  
 intuitively expect.
I thought well-written scripts should use "set -u"?
I didn't even know about that. But my point still stands -- if well-written scripts are supposed to use set -u, it should be the default. Hm... what about something like Environment.throwOnUnsetVariable = true; -Steve
Feb 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 14:26:22 UTC, Steven 
Schveighoffer wrote:
 On Tue, 26 Feb 2013 09:15:05 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:02:08 UTC, Steven 
 Schveighoffer wrote:
 If I use $XYZ in a script, and XYZ isn't set, it equates to 
 nothing.  When I use getenv, it returns null.  That is the 
 behavior I would intuitively expect.
I thought well-written scripts should use "set -u"?
I didn't even know about that. But my point still stands -- if well-written scripts are supposed to use set -u, it should be the default.
Pretty sure it can't be the default due to backwards-compatibility reasons.
 Hm... what about something like 
 Environment.throwOnUnsetVariable = true;
That would break with programs using distinct components that rely on that setting's value...
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 09:45:31 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:26:22 UTC, Steven Schveighoffer wrote:
 On Tue, 26 Feb 2013 09:15:05 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:02:08 UTC, Steven Schveighoffer  
 wrote:
 If I use $XYZ in a script, and XYZ isn't set, it equates to nothing.   
 When I use getenv, it returns null.  That is the behavior I would  
 intuitively expect.
I thought well-written scripts should use "set -u"?
I didn't even know about that. But my point still stands -- if well-written scripts are supposed to use set -u, it should be the default.
Pretty sure it can't be the default due to backwards-compatibility reasons.
 Hm... what about something like Environment.throwOnUnsetVariable = true;
That would break with programs using distinct components that rely on that setting's value...
They would just segfault instead of throwing an exception, no? I think people would understand those consequences, but they could be spelled out in the docs for that property. We could also make the default true, since that is what existing code currently expects. -Steve
Feb 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 14:56:37 UTC, Steven 
Schveighoffer wrote:
 That would break with programs using distinct components that 
 rely on that setting's value...
They would just segfault instead of throwing an exception, no? I think people would understand those consequences, but they could be spelled out in the docs for that property. We could also make the default true, since that is what existing code currently expects.
I'm not really following... What does a segfault have to do with it? What I meant is that you may use two components (two libraries, or the main program and one library) where one sets environment.throwOnUnsetVariable. If at least one component assumes that the setting is enabled or disabled, and the other component does not restore the previous setting after changing it, then the first component's behavior will change in an unexpected way.
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 10:19:04 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:56:37 UTC, Steven Schveighoffer wrote:
 That would break with programs using distinct components that rely on  
 that setting's value...
They would just segfault instead of throwing an exception, no? I think people would understand those consequences, but they could be spelled out in the docs for that property. We could also make the default true, since that is what existing code currently expects.
I'm not really following... What does a segfault have to do with it? What I meant is that you may use two components (two libraries, or the main program and one library) where one sets environment.throwOnUnsetVariable. If at least one component assumes that the setting is enabled or disabled, and the other component does not restore the previous setting after changing it, then the first component's behavior will change in an unexpected way.
You mean changing as in, instead of throwing an exception, it tries to use a null value and segfaults? Not a very significant difference. But we are splitting hairs here. The first one could potentially change the environment variable that the second uses, thereby affecting the behavior. -Steve
Feb 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 15:26:50 UTC, Steven 
Schveighoffer wrote:
 You mean changing as in, instead of throwing an exception, it 
 tries to use a null value and segfaults?  Not a very 
 significant difference.
I'm still not following... where would the segfault come from? Unless you dereference .ptr, you can't get a segfault from operating on a null string.
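To illustrate the point about null strings: ordinary operations on a null D string behave as if it were empty, so nothing crashes. This is a plain language demonstration, independent of std.process.

```d
void main()
{
    string s = null;

    assert(s.length == 0);

    // Concatenating with a null string yields just the other operand.
    string t = s ~ "/tmp";
    assert(t == "/tmp");

    // Iterating over a null string runs the loop body zero times.
    size_t n;
    foreach (c; s)
        ++n;
    assert(n == 0);

    // Only dereferencing this pointer directly would be unsafe.
    assert(s.ptr is null);
}
```

That is also why an unset variable can slip into a command line unnoticed: the concatenation succeeds quietly.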
 But we are splitting hairs here.  The first one could 
 potentially change the environment variable that the second 
 uses, thereby affecting the behavior.
That's a completely different matter from changing how code within the same program accesses the environment in general. Both components may be operating on specialized, prefix-named variables that have no chance of interfering with each other, and still break when the behavior of a global object changes. It would be safer for the component to define a very small wrapper, which changes environment's semantics according to its requirements.
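The "very small wrapper" idea could be sketched like this. The type name StrictEnv is hypothetical and not part of Phobos; the sketch assumes environment.get(name, null) returns null only for variables that are not set.

```d
import std.process : environment;
import std.exception : enforce;

/// Hypothetical per-component wrapper with throwing lookup semantics.
/// It leaves the global environment object's behavior untouched.
struct StrictEnv
{
    /// Returns the value, or throws if the variable is not set.
    string opIndex(string name) const
    {
        string value = environment.get(name, null);
        enforce(value !is null, "Environment variable not set: " ~ name);
        return value;
    }

    /// Supports checks of the form `"NAME" in env`.
    bool opBinaryRight(string op : "in")(string name) const
    {
        return environment.get(name, null) !is null;
    }
}

void main()
{
    import std.stdio : writeln;
    StrictEnv env;
    writeln("PATH" in env ? "PATH is set" : "PATH is not set");
}
```

A component that wants the lenient behavior would use environment.get directly, so neither choice leaks into other components.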
Feb 26 2013
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 15:47:41 UTC, Vladimir Panteleev 
wrote:
 That's a completely different matter from changing how code 
 within the same program accesses the environment in general. 
 Both components may be operating on specialized, prefix-named 
 variables that have no chance of interfering with each other, 
 and still break when the behavior of a global object changes.
To go back to the filesystem analogy, such a setting would be the equivalent of a global boolean variable in std.file, which made std.file.read return null if a file didn't exist.
Feb 26 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 10:47:40 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 15:26:50 UTC, Steven Schveighoffer wrote:
 You mean changing as in, instead of throwing an exception, it tries to  
 use a null value and segfaults?  Not a very significant difference.
I'm still not following... where would the segfault come from? Unless you dereference .ptr, you can't get a segfault from operating on a null string.
It's funny, I completely forgot about that! My brain was still in Objective-C/C++ mode :) You are right, the difference is important.
 But we are splitting hairs here.  The first one could potentially  
 change the environment variable that the second uses, thereby affecting  
 the behavior.
That's a completely different matter from changing how code within the same program accesses the environment in general. Both components may be operating on specialized, prefix-named variables that have no chance of interfering with each other, and still break when the behavior of a global object changes. It would be safer for the component to define a very small wrapper, which changes environment's semantics according to its requirements.
I was trying to come up with an equivalent to set -u. In other words, you said "good scripts use set -u", you could equivalently say "good D programs use Environment.throwOnMissingData = true". It was just a thought. Another possibility (naming to be determined): Environment.nthrow["x"]; // doesn't throw Environment["x"]; // throws At this point though, I think the discussion really is bikeshedding. Environment.get is not that much different than Environment[]. We should keep the current behavior in the interest of backwards compatibility. -Steve
Feb 26 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 16:37:26 UTC, Steven 
Schveighoffer wrote:
 At this point though, I think the discussion really is 
 bikeshedding.  Environment.get is not that much different than 
 Environment[].  We should keep the current behavior in the 
 interest of backwards compatibility.
I, too, have come to agree with this. Let us drop the environment discussion and focus on getting the spawnProcess() family of functions right. :) Lars
Feb 26 2013
prev sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, February 26, 2013 09:02:11 Steven Schveighoffer wrote:
 On one hand, I think the correct behavior is to return null, and let the
 program deal with checking the error, or use get if they have a default.
 If we throw an exception, people will end up catching the exception in
 order to avoid an unintended error. Exceptions are not good for flow
 control, they are for exceptional situations that you didn't plan for.
I think that it's far more correct to say that exceptions are for situations where it's reasonable for code to assume that something's the case when it might not be or when it's impossible for it to check. For instance, it's much cleaner to write a parser if the parser in general assumes that operations will succeed and throws when they don't. Then only a small part of the parser needs to worry about handling error cases. Or an example of when it would be impossible to check would be with file operations. You can (and should) check beforehand that the file exists, but there's no way to guarantee that the file will still exist when you actually operate on it (e.g. another process could delete it out from under you), so the file functions have to throw when the file isn't there anymore or you don't have permissions or whatever. If you have to keep checking return values for functions, then you should probably be using exceptions. The place to avoid exceptions is when the odds of an operation succeeding are low (or at least that there's a fairly good chance that it'll fail), because then it really is just becoming flow control. But I actually think that pushing for exceptions to be for "exceptional situations" is harmful, as that leads to people not using them, and the code ends up checking return values when it would be much cleaner if it didn't have to. Of course, there are plenty of people who are quite poor at that balance and end up over-using exceptions as well, so striking a good balance can be hard. In general though, the main question is whether it's reasonable for code to assume that an operation will succeed, and if it is, then an exception should be used. In the case of environment variables, whether that's reasonable or not depends on the code. 
There are programs out there which pretty much can't run if a particular environment variable hasn't been set (I deal with several at work which are that way), and if the program itself set the environment variable, then it should be able to assume that it's there. In either case, the exception route makes more sense. On plenty of other occasions though, it's not at all reasonable to assume that it's there and the code needs to check, in which case, throwing an exception doesn't make sense at all. But given the fact that opIndex is already generally assumed to succeed, I think that it makes perfect sense to make it throw on failure, and then have get return null on failure. Programs can then use whichever behavior is correct for their particular needs. - Jonathan M Davis
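The file example from this post, where a check beforehand cannot replace exception handling, might be sketched as follows. The helper name tryRead is made up for illustration, and invalid UTF-8 content would raise a different exception that this sketch does not handle.

```d
import std.file : exists, readText, FileException;
import std.stdio : writeln;

/// Hypothetical caller-side helper: returns the file's text, or null if the
/// file is missing or becomes unreadable between the check and the read.
string tryRead(string path)
{
    if (!path.exists)            // cheap pre-check for the common case
        return null;
    try
    {
        // May still throw: another process can delete the file in between,
        // or permissions may not allow reading.
        return readText(path);
    }
    catch (FileException e)
    {
        writeln("read failed: ", e.msg);
        return null;
    }
}

void main()
{
    writeln(tryRead("/nonexistent/example.txt") is null);
}
```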
Feb 26 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 26 February 2013 at 18:48:08 UTC, Jonathan M Davis 
wrote:
 On Tuesday, February 26, 2013 09:02:11 Steven Schveighoffer 
 wrote:
 On one hand, I think the correct behavior is to return null, 
 and let the
 program deal with checking the error, or use get if they have 
 a default.
 If we throw an exception, people will end up catching the 
 exception in
 order to avoid an unintended error. Exceptions are not good 
 for flow
 control, they are for exceptional situations that you didn't 
 plan for.
I think that it's far more correct to say that exceptions are for situations where it's reasonable for code to assume that something's the case when it might not be or when it's impossible for it to check. For instance, it's much cleaner to write a parser if the parser in general assumes that operations will succeed and throws when they don't. Then only a small part of the parser needs to worry about handling error cases. Or an example of when it would be impossible to check would be with file operations. You can (and should) check beforehand that the file exists, but there's no way to guarantee that the file will still exist when you actually operate on it (e.g. another process could delete it out from under you), so the file functions have to throw when the file isn't there anymore or you don't have permissions or whatever.
That is the best explanation I've read on the subject. I'm dead serious.
 If you have to keep checking return values for functions, then 
 you should
 probably be using exceptions. The place to avoid exceptions is 
 when the odds
 of an operation succeeding are low (or at least that there's a 
 fairly good
 chance that it'll fail), because then it really is just 
 becoming flow control.
 But I actually think that pushing for exceptions to be for 
 "exceptional
 situations" is harmful, as that leads to people not using them, 
 and the code
 ends up checking return values when it would be much cleaner if 
 it didn't have
 to. Of course, there are plenty of people who are quite poor at 
 that balance
 and end up over-using exceptions as well, so striking a good 
 balance can be
 hard.
I want to add a point that you don't address here: it is easy to forget to check a return value. For instance, how much C code checks the return value of printf? I'd be surprised if it is even 1%. When you don't, and things fail, your program is in an undefined state and starts doing crap. As exceptions are costly only when they are thrown, it doesn't make any sense to avoid them for speed, as doing crap very fast is rarely a goal that people want to achieve.
Mar 03 2013
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, March 03, 2013 12:11:16 deadalnix wrote:
 On Tuesday, 26 February 2013 at 18:48:08 UTC, Jonathan M Davis
 I think that it's far more correct to say that exceptions are
 for situations
 where it's reasonable for code to assume that something's the
 case when it
 might not be or when it's impossible for it to check. For
 instance, it's much
 cleaner to write a parser if the parser in general assumes that
 operations
 will succeed and throws when they don't. Then only a small part
 of the parser
 needs to worry about handling error cases. Or an example of
 when it would be
 impossible to check would be with file operations. You can (and
 should) check
 beforehand that the file exists, but there's no way to
 guarantee that the file
 will still exist when you actually operate on it (e.g. another
 process could
 delete it out from under you), so the file functions have to
 throw when the file
 isn't there anymore or you don't have permissions or whatever.
That is the best explanation I've read on the subject. I'm dead serious.
Well, if you have to debate exceptions and/or explain them enough times, you start coming up with better explanations, or you never get anywhere, and the whole "exceptional circumstances" bit is just way too vague, and everyone interprets it differently. And too often, I've argued exceptions with someone who basically agreed with my opinion, but we both sucked at explaining what we meant, so we ended up arguing until we understood that. It's definitely one of those cases where examples make things clearer, and if you can distill what the examples have in common, well then you get an explanation like what I gave.
 I want to add a point that you don't address here: it is easy to
 forget to check a return value. For instance, how much C code
 checks the return value of printf? I'd be surprised if it is even
 1%. When you don't, and things fail, your program is in an undefined
 state and starts doing crap.
Definitely, that is one of the reasons that error codes are generally horrible.
 As exceptions are costly only when they
 are thrown, it doesn't make any sense to avoid them for speed, as
 doing crap very fast is rarely a goal that people want to achieve.
The place that it makes sense to not use them when speed is a concern is when they're going to be thrown often or when the code in question simply cannot afford the extra time that the exception costs even in the rare situations where one actually gets thrown (which is the sort of situation that games might run into given their insane performance constraints but most everything else won't). Also, I don't believe that the cost of try-catch blocks is actually zero (though it _is_ relatively low), even when no exceptions are thrown because of stuff that must be done in case an exception is thrown, but it's definitely true that in most cases, not using exceptions because of performance concerns is a mistake, and if your code is really going to be throwing exceptions enough that it would be a performance concern, then using an exception doesn't make sense anyway. That's back to using exceptions as flow control. - Jonathan M Davis
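To make the point about unchecked return values concrete, here is a minimal sketch in D. The names parsePortCode and parsePort are hypothetical and exist only for illustration; the only library call assumed is std.conv.to, which throws ConvException on malformed input.

```d
import std.conv : to, ConvException;

/// Hypothetical error-code style: returns false on failure.
/// Nothing forces the caller to look at the returned bool, and because
/// `result` is an `out` parameter it is reset to 0 on entry, so a caller
/// who ignores the bool silently carries on with port 0.
bool parsePortCode(string text, out int result)
{
    try
    {
        result = text.to!int;
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

/// Exception style: a failure cannot be skipped by accident, and the
/// success path pays nothing beyond the conversion itself.
int parsePort(string text)
{
    return text.to!int; // throws ConvException on malformed input
}
```

With the first version, a call such as parsePortCode("80x", port) compiles and runs even if the result is discarded; with the second, the same bad input propagates as an exception until someone handles it.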
Mar 03 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 18:39:10 H. S. Teoh wrote:
 Alternatively, I would push for renaming the old std.process to
 something like old.process (or something else), which is much less
 of a breakage than deleting it from Phobos outright -- existing code
 just need to have their imports fixed and will continue working,
 whereas deleting the module outright leaves existing code with no
 recourse but to potentially rewrite from scratch. This may be easier
 to convince Walter & Andrei on, than outright killing old deprecated
 modules.
Possibly, but Walter takes a very dim view on most any code breakage, even if it means simply changing a makefile to make your code work again, so I'd be very surprised if he thought that moving the current std.process would be acceptable. If Andrei could be convinced, then we could probably do it, but I wouldn't expect him to agree, and IIRC, he had no problem with the std.process2 scheme and might even have suggested it. So, I suspect that your only hope of avoiding std.process2 is if you can come up with a better name.
[...] I suppose std.proc is out of the question? ;-) I find this rather frustrating... sometimes it feels like Phobos is suffering from premature standardization - we have a module with a design that isn't very good, but just because it somehow got put into Phobos, now it has to stick, no matter what. That's what we should do *after* we have a good design, but at this point, the current std.process clearly isn't ready to be cast in stone yet, yet we insist it can't be changed (at least, not easily). So every little design mistake that got overlooked in review and made it into Phobos becomes stuck, even when the design is really only experimental to begin with. I think we should seriously consider the idea someone brought up in this forum recently, of an experimental section of Phobos where new stuff is put in, and subject to actual field-testing (as opposed to just toy test cases when it was written), before it goes into Phobos proper. I really do not want to see this problem repeated over and over, and we end up with 15 modules with 2 or 3 appended to their name just because nothing can be changed after it's put in. It really detracts from D's presentability, esp. to outsiders and prospective new users, IMO. T -- Windows 95 was a joke, and Windows 98 was the punchline.
Feb 23 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/24/13 4:58 AM, H. S. Teoh wrote:
 I find this rather frustrating... sometimes it feels like Phobos is
 suffering from premature standardization - we have a module with a
 design that isn't very good, but just because it somehow got put into
 Phobos, now it has to stick, no matter what.
It's a good sign - growing pains and acquiring users and all. Python broke even "hello, world" from one major release to another. Andrei
Feb 23 2013
parent reply "Don" <turnyourkidsintocash nospam.com> writes:
On Sunday, 24 February 2013 at 07:58:40 UTC, Andrei Alexandrescu 
wrote:
 On 2/24/13 4:58 AM, H. S. Teoh wrote:
 I find this rather frustrating... sometimes it feels like 
 Phobos is
 suffering from premature standardization - we have a module 
 with a
 design that isn't very good, but just because it somehow got 
 put into
 Phobos, now it has to stick, no matter what.
It's a good sign - growing pains and acquiring users and all. Python broke even "hello, world" from one major release to another. Andrei
I don't think this is true at all. With respect -- I think Walter has absolutely no clue about backwards compatibility and deprecation. Here's how it should work: 1. You make promises (about future compatibility). 2. You keep those promises. Walter tries to do (2). without doing (1). The result is the insanity we've had for years. It means an unpredictable, unplanned set of often undesirable behaviour is preserved, that doesn't help stability anyway. We need to do (1). Can we please stop pretending this is acceptable? It's not "growing pains" or anything like that. It's a basic misunderstanding of stability.
Feb 25 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-02-25 17:20, Don wrote:

 I don't think this is true at all.
 With respect -- I think Walter has absolutely no clue about backwards
 compatibility and deprecation.

 Here's how it should work:
 1. You make promises  (about future compatibility).
 2. You keep those promises.

 Walter tries to do (2). without doing (1). The result is the insanity
 we've had for years. It means an unpredictable, unplanned set of often
 undesirable behaviour is preserved, that doesn't help stability anyway.

 We need to do (1).

 Can we please stop pretending this is acceptable?
 It's not "growing pains" or anything like that. It's a basic
 misunderstanding of stability.
I completely agree. -- /Jacob Carlborg
Feb 25 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/23/2013 6:58 PM, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis wrote:
 Possibly, but Walter takes a very dim view on most any code breakage,
 even if it means simply changing a makefile to make your code work
 again,
I find this rather frustrating...
Consider the common complaint from numerous people that "my code breaks with every new release". Even if the fix is "simple". Just today, rdmd doesn't compile anymore.
Feb 25 2013
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 02/25/2013 08:57 PM, Walter Bright wrote:
 On 2/23/2013 6:58 PM, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis wrote:
 Possibly, but Walter takes a very dim view on most any code breakage,
 even if it means simply changing a makefile to make your code work
 again,
I find this rather frustrating...
Consider the common complaint from numerous people that "my code breaks with every new release".
I might be biased but I think a good portion of them are because of regressions.
 Even if the fix is "simple".
Simple fixes should be (semi-)automated by appropriate tooling.
 Just today, rdmd doesn't compile anymore.
Because of a Phobos regression.
Feb 25 2013
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-02-25 20:57, Walter Bright wrote:

 Consider the common complaint from numerous people that "my code breaks
 with every new release".

 Even if the fix is "simple".

 Just today, rdmd doesn't compile anymore.
Read Don's post: http://forum.dlang.org/thread/stxxtfwfrwllkcpunhue forum.dlang.org?page=10#post-rmisaclgytudqcdecvzb:40forum.dlang.org -- /Jacob Carlborg
Feb 25 2013
prev sibling parent reply Lee Braiden <leebraid gmail.com> writes:
On Mon, 25 Feb 2013 11:57:40 -0800, Walter Bright wrote:

 On 2/23/2013 6:58 PM, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis wrote:
 Possibly, but Walter takes a very dim view on most any code breakage,
 even if it means simply changing a makefile to make your code work
 again,
I find this rather frustrating...
Consider the common complaint from numerous people that "my code breaks with every new release".
Yes, and as a compiled systems language, I think D needs to aim for compiling code from a decade ago, like GCC can (at least using -ansi etc.). It seems like some people modify D core libraries like they would for Python, so that 3.0 code works, but 2.2 code doesn't etc. I don't think that's appropriate for D. Not if it wants to be taken as seriously as C/C++, at least.
 
 Even if the fix is "simple".
 
 Just today, rdmd doesn't compile anymore.
It really would be nice to have rdmd included as part of DMD, and gdc etc. To me, it's a fundamental feature of the language, to be able to use it for scripts as well as pre-compiled code. If it was, and there were unit tests as part of releases (or even commits to master, say), then this problem of RDMD breakage wouldn't happen. -- Lee
Feb 26 2013
next sibling parent "pjmlp" <pjmlp progtools.org> writes:
On Tuesday, 26 February 2013 at 08:31:05 UTC, Lee Braiden wrote:
 On Mon, 25 Feb 2013 11:57:40 -0800, Walter Bright wrote:

 On 2/23/2013 6:58 PM, H. S. Teoh wrote:
 On Sat, Feb 23, 2013 at 06:46:13PM -0800, Jonathan M Davis 
 wrote:
 Possibly, but Walter takes a very dim view on most any code 
 breakage,
 even if it means simply changing a makefile to make your 
 code work
 again,
I find this rather frustrating...
Consider the common complaint from numerous people that "my code breaks with every new release".
Yes, and as a compiled systems language, I think D needs to aim for compiling code from a decade ago, like GCC can (at least using -ansi etc.). It seems like some people modify D core libraries like they would for Python, so that 3.0 code works, but 2.2 code doesn't etc. I don't think that's appropriate for D. Not if it wants to be taken as seriously as C/C++, at least.
To be honest, for those of us old enough, C and C++ compilers went through the same process. One of the things that initially attracted me to Java was that my supposedly portable C and C++ code was riddled with #ifdefs to work around compiler issues. -- Paulo
Feb 26 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-02-26 09:31, Lee Braiden wrote:

 It really would be nice to have rdmd included as part of DMD, and gdc
 etc.  To me, it's a fundamental feature of the language, to be able to
 use it for scripts as well as pre-compiled code.

 If it was, and there were unit tests as part of releases (or even commits
 to master, say), then this problem of RDMD breakage wouldn't happen.
RDMD is part of the DMD release. -- /Jacob Carlborg
Feb 26 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 26 February 2013 at 10:57:19 UTC, Jacob Carlborg 
wrote:
 On 2013-02-26 09:31, Lee Braiden wrote:

 It really would be nice to have rdmd included as part of DMD, 
 and gdc
 etc.  To me, it's a fundamental feature of the language, to be 
 able to
 use it for scripts as well as pre-compiled code.

 If it was, and there were unit tests as part of releases (or 
 even commits
 to master, say), then this problem of RDMD breakage wouldn't 
 happen.
RDMD is part of the DMD release.
Some Linux packagers may not include it, due to some misunderstanding. When dmd2 made its way into the main Arch Linux repositories I had a small e-mail conversation with the new maintainer, as I had to prove that using code from DPL/tools is fine license-wise - he was reluctant to put rdmd in by default.
Feb 26 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-02-26 12:02, Dicebot wrote:

 Some linux packagers may not include it, for unknown misunderstanding.
 When dmd2 made its way into main Arch Linux repositories I had a small
 e-mail conversation with new maintainer as I had to prove using code
 from DPL/tools is fine license-wise - he was reluctant to put rdmd in by
 default.
Seems weird. -- /Jacob Carlborg
Feb 26 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/26/2013 12:31 AM, Lee Braiden wrote:
 If it was, and there were unit tests as part of releases (or even commits
 to master, say), then this problem of RDMD breakage wouldn't happen.
That's a different issue. The issue I was talking about was working D code no longer compiling due to library changes.
Feb 26 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, February 23, 2013 18:58:28 H. S. Teoh wrote:
 I suppose std.proc is out of the question? ;-)
I don't know. Maybe.
 I find this rather frustrating... sometimes it feels like Phobos is
 suffering from premature standardization - we have a module with a
 design that isn't very good, but just because it somehow got put into
 Phobos, now it has to stick, no matter what. That's what we should do
 *after* we have a good design, but at this point, the current
 std.process clearly isn't ready to be cast in stone yet, yet we insist
 it can't be changed (at least, not easily). So every little design
 mistake that got overlooked in review and made it into Phobos becomes
 stuck, even when the design is really only experimental to begin with.
To some extent, I agree, but at the same time, we're taking forever to stabilize things, and unless we do, D will never take off, because no one will be able to rely on its API.
 I think we should seriously consider the idea someone brought up in this
 forum recently, of an experimental section of Phobos where new stuff is
 put in, and subject to actual field-testing (as opposed to just toy test
 cases when it was written), before it goes into Phobos proper.
I definitely think that this should be considered. I think that it's often the case that the stuff that makes it into Phobos was either created specifically for Phobos (and didn't get much use ahead of time), or it's someone's personal module that they thought would be useful (in which case, it may have gotten heavy use from them but not by many people besides them). And freezing APIs before they've been field-tested means that we'll be permanently stuck with subpar APIs.
 I really
 do not want to see this problem repeated over and over, and we end up
 with 15 modules with 2 or 3 appended to their name just because nothing
 can be changed after it's put in. It really detracts from D's
 presentability, esp. to outsiders and prospective new users, IMO.
I honestly wouldn't expect many modules to be replaced outright. It's mostly just the older ones which risk that. But if we ever have to do that with modules that went through the full review process, then we need to rethink how that's done. A probationary area for modules (where they're in Phobos but not in std yet) may very well help mitigate any such problems. - Jonathan M Davis
Feb 23 2013
parent 1100110 <0b1100110 gmail.com> writes:
On 02/23/2013 09:07 PM, Jonathan M Davis wrote:
 On Saturday, February 23, 2013 18:58:28 H. S. Teoh wrote:
 I suppose std.proc is out of the question? ;-)
I don't know. Maybe.
 I find this rather frustrating... sometimes it feels like Phobos is
 suffering from premature standardization - we have a module with a
 design that isn't very good, but just because it somehow got put into
 Phobos, now it has to stick, no matter what. That's what we should do
 *after* we have a good design, but at this point, the current
 std.process clearly isn't ready to be cast in stone yet, yet we insist
 it can't be changed (at least, not easily). So every little design
 mistake that got overlooked in review and made it into Phobos becomes
 stuck, even when the design is really only experimental to begin with.
To some extent, I agree, but at the same time, we're taking forever to stabilize things, and unless we do, D will never take off, because no one will be able to rely on its API.
 I think we should seriously consider the idea someone brought up in this
 forum recently, of an experimental section of Phobos where new stuff is
 put in, and subject to actual field-testing (as opposed to just toy test
 cases when it was written), before it goes into Phobos proper.
I definitely think that this should be considered. I think that it's often the case that the stuff that makes it into Phobos was either created specifically for Phobos (and didn't get much use ahead of time), or it's someone's personal module that they thought would be useful (in which case, it may have gotten heavy use from them but not by many people besides them). And freezing APIs before they've been field-tested means that we'll be permanently stuck with subpar APIs.
 I really
 do not want to see this problem repeated over and over, and we end up
 with 15 modules with 2 or 3 appended to their name just because nothing
 can be changed after it's put in. It really detracts from D's
 presentability, esp. to outsiders and prospective new users, IMO.
I honestly wouldn't expect many modules to be replaced outright. It's mostly just the older ones which risk that. But if we ever have to do that with modules that went through the full review process, then we need to rethink how that's done. A probationary area for modules (where they're in Phobos but not in std yet) may very well help mitigate any such problems. - Jonathan M Davis
std.future.process; and we've talked about it too much at this point. It will never be done.
Feb 24 2013
prev sibling next sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 23.02.2013 12:31, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  The
 upshot is that the module has actually seen active use over those years,
 both by yours truly and others, so hopefully the worst wrinkles are
 already ironed out.
 
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151
 
 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d
 
 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
 
 
 I hope we can get it reviewed in time for the next release.  (The wiki
 page indicates that both std.benchmark and std.uni are currently being
 reviewed, but I fail to find any "official" review threads on the
 forum.  Is the wiki just out of date?)
 
 Lars
I haven't read all responses (sorry), but considering that there don't seem to be API conflicts between the old and new std.process, why don't we just keep the old C style functions and deprecate them? No need to rename or create a separate module AFAICS.
Feb 24 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 10:23:36 UTC, Sönke Ludwig wrote:
 I haven't read all responses (sorry), but considering that 
 there don't
 seem to be API conflicts between the old and new std.process, 
 why don't
 we just keep the old C style functions and deprecate them? No 
 need to
 rename or create a separate module AFAICS.
There are API conflicts. See my reply to Jonathan. Lars
Feb 24 2013
prev sibling next sibling parent reply "Jonas Drewsen" <jdrewsen nospam.com> writes:
1, What about support for nonblocking wait(). It would be very 
nice not to block the main thread if you really don't care about 
waiting for the sub process but just want to be nice and not 
create zombies.

2, What about nonblocking read/writes or support for timing out 
reads/writes at least. On linux you can select() on the file 
descriptor but that is not supported on windows.

I believe that support for nonblocking operation is important 
since spawning external processes  is a common way to parallelize 
work and std.concurrency could take advantage of this as well.

/Jonas
Feb 24 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 08:03:24 -0500, Jonas Drewsen <jdrewsen nospam.com>  
wrote:

 1, What about support for nonblocking wait(). It would be very nice not  
 to block the main thread if you really don't care about waiting for the  
 sub process but just want to be nice and not create zombies.
Non-blocking wait was brought up. I think we can add it. It would be non-blocking wait on specific processes though, I think doing a wait for *any* process is a more difficult problem to solve, and is not supported in the current std.process. It may be something added later.
 2, What about nonblocking read/writes or support for timing out  
 reads/writes at least. On linux you can select() on the file descriptor  
 but that is not supported on windows.
This is not an issue with std.process, but rather with File. If File doesn't support non-blocking read/write, that is an issue, but we should solve it for all streams, not just pipes.
 I believe that support for nonblocking operation is important since  
 spawning external processes  is a common way to parallelize work and  
 std.concurrency could take advantage of this as well.
I agree, and would love to see File support non-blocking operations. -Steve
Feb 24 2013
parent "Jonas Drewsen" <jdrewsen nospam.com> writes:
On Sunday, 24 February 2013 at 14:44:51 UTC, Steven Schveighoffer 
wrote:
 On Sun, 24 Feb 2013 08:03:24 -0500, Jonas Drewsen 
 <jdrewsen nospam.com> wrote:

 1, What about support for nonblocking wait(). It would be very 
 nice not to block the main thread if you really don't care 
 about waiting for the sub process but just want to be nice and 
 not create zombies.
Non-blocking wait was brought up. I think we can add it. It would be non-blocking wait on specific processes though, I think doing a wait for *any* process is a more difficult problem to solve, and is not supported in the current std.process. It may be something added later.
I saw that you have implemented this now - great!
 2, What about nonblocking read/writes or support for timing 
 out reads/writes at least. On linux you can select() on the 
 file descriptor but that is not supported on windows.
This is not an issue with std.process, but rather with File. If File doesn't support non-blocking read/write, that is an issue, but we should solve it for all streams, not just pipes.
Since File is just used to wrap the fd created in std.process, it is not possible for File to make this non-blocking, since Windows doesn't have proper support for non-blocking fds that are not sockets, e.g. files or pipes. The recommended way to communicate non-blocking between processes this way is to create a named pipe and do WaitForMultipleObjects on that. /Jonas
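For the POSIX side of this, a minimal sketch of a read with a timeout on a pipe wrapped in a File. It uses poll() in place of the select() mentioned above (same idea, simpler to set up), and the helper name readableWithin is hypothetical. It assumes File.fileno and druntime's core.sys.posix.poll bindings, and it does not attempt to cover Windows. One caveat: File is buffered, so data already sitting in the C stdio buffer is invisible to poll().

```d
import core.sys.posix.poll : poll, pollfd, POLLIN;
import std.stdio : File;

/// Returns true if the descriptor behind `f` becomes readable (or reports
/// hang-up/error) within `timeoutMs` milliseconds, false on timeout.
/// POSIX only; hypothetical helper, not part of std.process.
bool readableWithin(File f, int timeoutMs)
{
    pollfd pfd;
    pfd.fd = f.fileno;   // underlying file descriptor of the File
    pfd.events = POLLIN; // we only care about "data available to read"
    return poll(&pfd, 1, timeoutMs) > 0;
}
```

A caller would check readableWithin(pipe, 100) before calling a blocking read such as readln(), and do other work when it returns false.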
Feb 24 2013
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use over 
 those years, both by yours truly and others, so hopefully the 
 worst wrinkles are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  
 (The wiki page indicates that both std.benchmark and std.uni 
 are currently being reviewed, but I fail to find any "official" 
 review threads on the forum.  Is the wiki just out of date?)
Ok, there have been several posts about changes and additions to wait(), and I believe Steve has answered them all, so let me just summarise:

Non-blocking wait:

This is a good idea, and it should be simple to implement. I'll do that.

Wait for all processes:

This would certainly be convenient, and it is simple to implement. It would require a static __gshared Pid[int] of all processes created by spawnProcess(), and we'd simply call wait() on all of those. The question is whether this is simple enough that we can just leave it to the user, thereby avoiding the need for a global Pid cache. (I think so.) Note that Windows has built-in functionality for waiting for a set of processes, namely WaitForMultipleObjects(). std.process2 would take advantage of that as a minor optimisation, but my guess is that the benefit would be negligible.

Wait for any process:

This one is a problem. POSIX supports it natively, but as has been pointed out by others, it would necessarily also affect processes *not* created through std.process. I don't see any obvious solution to this. On Windows, this is also supported through WaitForMultipleObjects(), but again, it would require a global Pid cache. I think this one is a no-go.

Lars
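As a rough illustration of why "wait for all" can be left to the user: if the caller keeps its own array of Pids, the whole thing is a loop over wait(). This is only a sketch; waitAll is a hypothetical user-side helper, and it assumes spawnProcess() returns a Pid and wait(pid) returns the exit code as an int.

```d
import std.process : Pid, spawnProcess, wait;

/// Hypothetical user-side helper: waits for every process in `pids`,
/// in order, and returns their exit codes in the same order.
/// No global Pid cache is needed because the caller owns the list.
int[] waitAll(Pid[] pids)
{
    int[] codes;
    foreach (pid; pids)
        codes ~= wait(pid); // blocks until this particular child has exited
    return codes;
}
```

The total time is bounded by the slowest child regardless of the order in which the children finish, since a wait() on a child that has already exited returns immediately.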
Feb 24 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 10:01:12 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:


 Wait for all processes:

 This would certainly be convenient, and it is simple to implement.  It  
 would require a static __gshared Pid[int] of all processes created by  
 spawnProcess(), and we'd simply call wait() on all of those.
Waiting for ALL processes in an array should be a simple matter of some algorithm call. I didn't see this specific request. I think the request was to wait for any process in a specific subset to complete. And then of course, the "wait for any process." Either of these is possible similar to how core.thread keeps track of all threads, we could simply ignore any child exits that we didn't manage. And I think it's worth implementing at some point (this is not an easy problem to solve correctly), but not for this release. -Steve
Feb 24 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 15:37:10 UTC, Steven Schveighoffer 
wrote:
 On Sun, 24 Feb 2013 10:01:12 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:


 Wait for all processes:

 This would certainly be convenient, and it is simple to 
 implement.  It would require a static __gshared Pid[int] of 
 all processes created by spawnProcess(), and we'd simply call 
 wait() on all of those.
Waiting for ALL processes in an array should be a simple matter of some algorithm call. I didn't see this specific request. I think the request was to wait for any process in a specific subset to complete. And then of course, the "wait for any process."
Yeah, maybe nobody requested it, and it was just a subconscious personal desire of mine. I actually had a waitAll() function in my very first draft, as you may recall. It was only implemented for POSIX, and it had all the problems mentioned here wrt. processes not created by spawnProcess(), so you (rightly) convinced me to remove it. I even had a waitAny() function in there, which you also (again, rightly) didn't take kindly to. :)
 Either of these is possible similar to how core.thread keeps 
 track of all threads, we could simply ignore any child exits 
 that we didn't manage.  And I think it's worth implementing at 
 some point (this is not an easy problem to solve correctly), 
 but not for this release.
I agree, this will be difficult or impossible on POSIX. Calling wait() and then ignoring the processes we don't care about won't do, because then we've ruined the chance for other code to call wait() on those processes. Lars
Feb 24 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 10:53:48 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Sunday, 24 February 2013 at 15:37:10 UTC, Steven Schveighoffer wrote:
 Either of these is possible similar to how core.thread keeps track of  
 all threads, we could simply ignore any child exits that we didn't  
 manage.  And I think it's worth implementing at some point (this is not  
 an easy problem to solve correctly), but not for this release.
I agree, this will be difficult or impossible on POSIX. Calling wait() and then ignoring the processes we don't care about won't do, because then we've ruined the chance for other code to call wait() on those processes.
I think if we have this feature, there needs to be a big fat warning not to start child processes except through this library. We could provide a hook to call when an unknown child exits... -Steve
Feb 24 2013
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
Kyllingstad wrote:
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version with non-blocking wait is up. Lars
Feb 24 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Feb-2013 21:41, Lars T. Kyllingstad wrote:
 On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. Kyllingstad wrote:
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version with non-blocking wait is up.
asyncWait would be less verbose :) Also how about returning a "Future" object that you may block on some loong time later? -- Dmitry Olshansky
Feb 24 2013
next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Feb-2013 22:05, Dmitry Olshansky wrote:
 24-Feb-2013 21:41, Lars T. Kyllingstad wrote:
 On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. Kyllingstad wrote:
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version with non-blocking wait is up.
asyncWait would be less verbose :)
 Also how about returning a "Future" object that you may block on some
 loong time later?
Oh wait, the Pid itself + blocking wait fits the bill. Then all is fine as is, sorry for the noise. -- Dmitry Olshansky
Feb 24 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 18:05:14 UTC, Dmitry Olshansky 
wrote:
 24-Feb-2013 21:41, Lars T. Kyllingstad wrote:
 On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
 Kyllingstad wrote:
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version with non-blocking wait is up.
asyncWait would be less verbose :)
To me, "asynchronous" implies that something is going on in the background that will produce a result in the future. That is not what happens here.

I agree that nonBlockingWait() is less than ideal, though, mainly because it is an oxymoron. :) I considered "status", "isAlive", etc., but I think it is important to emphasise the fact that if the process *has* terminated, nonBlockingWait() has the same, perhaps non-obvious, effects as wait():

On POSIX, it makes the OS clean up after the process.
On Windows, it closes the process handle.
On all platforms, it invalidates the processID and osHandle properties of the Pid object.

If you or anyone else have a better suggestion, I'm all ears.

Lars
Feb 24 2013
next sibling parent reply "jerro" <a a.com> writes:
 To me, "asynchronous" implies that something is going on in the 
 background that will produce a result in the future.  That is 
 not what happens here.

 I agree that nonBlockingWait() is less than ideal, though, 
 mainly because it is an oxymoron. :)  I considered "status", 
 "isAlive", etc., but I think it is important to emphasise the 
 fact that if the process *has* terminated, nonBlockingWait() 
 has the same, perhaps non-obvious, effects as wait():

 On POSIX, it makes the OS clean up after the process.
 On Windows, it closes the process handle.
 On all platforms, it invalidates the processID and osHandle 
 properties of the Pid object.

 If you or anyone else have a better suggestion, I'm all ears.

 Lars
Maybe tryWait?
Feb 24 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 18:56:39 UTC, jerro wrote:
 To me, "asynchronous" implies that something is going on in 
 the background that will produce a result in the future.  That 
 is not what happens here.

 I agree that nonBlockingWait() is less than ideal, though, 
 mainly because it is an oxymoron. :)  I considered "status", 
 "isAlive", etc., but I think it is important to emphasise the 
 fact that if the process *has* terminated, nonBlockingWait() 
 has the same, perhaps non-obvious, effects as wait():

 On POSIX, it makes the OS clean up after the process.
 On Windows, it closes the process handle.
 On all platforms, it invalidates the processID and osHandle 
 properties of the Pid object.

 If you or anyone else have a better suggestion, I'm all ears.

 Lars
Maybe tryWait?
I like it. :) Lars
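
A minimal sketch of how such a polling call might be used, assuming it ends up named tryWait and returns a result with `terminated` and `status` fields. That matches the std.process that later shipped in Phobos, not necessarily the std.process2 draft under review. The `waitByPolling` helper and the `sleep` child are illustrative only:

```d
import core.thread : Thread;
import core.time : msecs;
import std.process : Pid, spawnProcess, tryWait;
import std.stdio : writeln;

// Hypothetical helper: poll the child instead of blocking in wait().
int waitByPolling(Pid pid)
{
    while (true)
    {
        // Returns immediately. Once `terminated` is true, the Pid has been
        // cleaned up just as if wait() had been called.
        auto result = tryWait(pid);
        if (result.terminated)
            return result.status;
        Thread.sleep(100.msecs); // a real program would do other work here
    }
}

void main()
{
    // "sleep" is a POSIX utility, used here only as a slow child process.
    auto pid = spawnProcess(["sleep", "1"]);
    writeln("exit status: ", waitByPolling(pid));
}
```

The point of the name is visible in the loop: the call never blocks, but a successful "try" has the same side effects on the Pid as wait().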
Feb 24 2013
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Feb-2013 22:42, Lars T. Kyllingstad wrote:
 On Sunday, 24 February 2013 at 18:05:14 UTC, Dmitry Olshansky wrote:
 24-Feb-2013 21:41, Lars T. Kyllingstad wrote:
 On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. Kyllingstad
 wrote:
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version with non-blocking wait is up.
asyncWait would be less verbose :)
To me, "asynchronous" implies that something is going on in the background that will produce a result in the future. That is not what happens here. I agree that nonBlockingWait() is less than ideal, though, mainly because it is an oxymoron. :) I considered "status", "isAlive", etc., but I think it is important to emphasise the fact that if the process *has* terminated, nonBlockingWait() has the same, perhaps non-obvious, effects as wait():
detach or detachProcess maybe as a method on Pid struct. Then there is no need to handle status codes etc.
 On POSIX, it makes the OS clean up after the process.
 On Windows, it closes the process handle.
 On all platforms, it invalidates the processID and osHandle properties
 of the Pid object.

 If you or anyone else have a better suggestion, I'm all ears.

 Lars
-- Dmitry Olshansky
Feb 24 2013
prev sibling parent Lee Braiden <leebraid gmail.com> writes:
On Sun, 24 Feb 2013 19:42:23 +0100, Lars T. Kyllingstad wrote:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/
std_process2.html

 Ok, a new version with non-blocking wait is up.
asyncWait would be less verbose :)
To me, "asynchronous" implies that something is going on in the background that will produce a result in the future. That is not what happens here. I agree that nonBlockingWait() is less than ideal, though, mainly because it is an oxymoron. :) I considered "status", "isAlive", etc., but I think it is important to emphasise the fact that if the process *has* terminated, nonBlockingWait() has the same, perhaps non-obvious, effects as wait():
I think something like getExitStatus() or checkExitStatus() makes more sense in terms of what the function actually does. The only problem is that you lose the connection with wait(). wait is poorly named anyway though, it's only good because it's the traditional name. A combo like waitForExitStatus() and checkForExitStatus() would probably make more sense. Although I guess we're getting into java-style names, rather than C-style names ;) -- Lee
Feb 24 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 24 February 2013 at 17:41:44 UTC, Lars T. Kyllingstad 
wrote:
 Ok, a new version with non-blocking wait is up.
1. Can the Firefox example be replaced with something else? Spawning a specific browser to open a webpage is bad practice, and I've noticed it in several programs. It would be nice not to perpetuate it any further. There exists a proper way to open a URL in the default browser, already implemented as browse(url) in the current std.process.

2. (Nitpick) The grep example uses a POSIX quoting syntax (single quotes). It would be better to use double quotes, or to pass the arguments as an array, to avoid one more needless OS-specific element.

3. The documentation for the "gui" config item seems to be wrong: it prevents the creation of a console, instead of causing it.

4. I see that my command escaping functions have not made it through. I believe the matter has been discussed before, and I thought the consensus was to use them, although it's been a while. The function escapeShellCommand and its callees from the current std.process have the advantages that a) they come with a very thorough unit test, whereas std.process2's Windows escaping code does not have tests at all, and b) they are usable independently, which allows constructing scripts and batch files in D programs.

5. The spawnProcess versions which take a single string command simply call std.string.split on the command. I believe this approach to be fundamentally wrong, as passing an argument in quotes will not work as expected. Furthermore, on Windows (where process arguments are passed as a single string), splitting the string in spawnProcess and then putting it back together in spawnProcessWindows will result in a final command line that is different from the one supplied by the user.

6. What are the reasons why this module can't be integrated with the existing std.process? I've noticed it mentioned a few times but couldn't actually find the reasoning. Can anyone post the link?

7. How do I test this with a recent version of Phobos? I'm getting the following runtime exception with the "ls" example:

std.stdio.StdioException std\stdio.d(2343): Failed to pass stdin stream to child process
----------------
0x0041D504 in char[][] core.sys.windows.stacktrace.StackTrace.trace()
0x0041D38F in core.sys.windows.stacktrace.StackTrace core.sys.windows.stacktrace.StackTrace.__ctor()
0x004124A8 in D3std8process219spawnPАЖПWindowsFNeAyaxAтPvSАД░5┌io4FileАРРАРРEАНр6ConfigZCАНЦ3Pid13prepa┘St┘amFKАР╔kАГ JiJPvZv
0x0040F8AA in trusted std.process2.Pid std.process2.spawnProcess(immutable(char)[], const(immutable(char)[][]), std.stdio.File, std.stdio.File, std.stdio.File, std.process2.Config)
0x0040A353 in trusted std.process2.Pid std.process2.spawnProcess(immutable(char)[], std.stdio.File, std.stdio.File, std.stdio.File, std.process2.Config)
0x004020B9 in _Dmain

Do I need a patched snn.lib or something else?
Feb 24 2013
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/24/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 a) they come with a very thorough unit test
Unfortunately those unittests are never being run in the Phobos test-suite and as a result a regression was missed (which was fixed by now): http://d.puremagic.com/issues/show_bug.cgi?id=9309 The report for the unittests: http://d.puremagic.com/issues/show_bug.cgi?id=9310
Feb 24 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 24 February 2013 at 21:40:24 UTC, Andrej Mitrovic 
wrote:
 On 2/24/13, Vladimir Panteleev <vladimir thecybershadow.net> 
 wrote:
 a) they come with a very thorough unit test
Unfortunately those unittests are never being run in the Phobos test-suite and as a result a regression was missed (which was fixed by now): http://d.puremagic.com/issues/show_bug.cgi?id=9309 The report for the unittests: http://d.puremagic.com/issues/show_bug.cgi?id=9310
Hmm, that's unfortunate. Some of that function's callees are tested, but not the function itself. The reason why unittest_burnin is not set in Phobos is that 1) it requires a helper program to be compiled beforehand, and 2) it runs indefinitely with random variations (hence its name). I suppose a simpler additional unit test would be appropriate. https://github.com/D-Programming-Language/phobos/pull/1161
Feb 24 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 16:04:43 -0500, Vladimir Panteleev
<vladimir thecybershadow.net> wrote:

 On Sunday, 24 February 2013 at 17:41:44 UTC, Lars T. Kyllingstad wrote:
 Ok, a new version with non-blocking wait is up.
3. The documentation for the "gui" config item seems to be wrong: it
 prevents the creation of a console, instead of causing it.
It means 'use gui mode' which means, don't create a console. I don't consider a console window a gui.
 4. I see that my command escaping functions have not made it through. I
 believe the matter has been discussed before, and I thought the
 consensus was to use them, although it's been a while. The function
 escapeShellCommand and its callees from the current std.process have the
 advantages that a) they come with a very thorough unit test, whereas
 std.process2's Windows escaping code does not have tests at all, and b)
 they are usable independently, which allows constructing scripts and
 batch files in D programs.
I had also thought we were going to use those (that was the consensus I remember). It probably was just forgotten. Lars?
 6. What are the reasons why this module can't be integrated with the
 existing std.process? I've noticed it mentioned a few times but couldn't
 actually find the reasoning, anyone can post the link?
Lars just mentioned his reasons in this thread. Let me see... http://forum.dlang.org/post/pnspeckullzedovpvjcx forum.dlang.org
 7. How do I test this with a recent version of Phobos? I'm getting the
 following runtime exception with the "ls" example:

 std.stdio.StdioException std\stdio.d(2343): Failed to pass stdin stream to child process
 ----------------
 0x0041D504 in char[][] core.sys.windows.stacktrace.StackTrace.trace()
 0x0041D38F in core.sys.windows.stacktrace.StackTrace core.sys.windows.stacktrace.StackTrace.__ctor()
 0x004124A8 in D3std8process219spawnPАЖПWindowsFNeAyaxAтPvSАД░5┌io4FileАРРАРРEАНр6ConfigZCАНЦ3Pid13prepa┘St┘amFKАР╔kАГ JiJPvZv
 0x0040F8AA in trusted std.process2.Pid std.process2.spawnProcess(immutable(char)[], const(immutable(char)[][]), std.stdio.File, std.stdio.File, std.stdio.File, std.process2.Config)
 0x0040A353 in trusted std.process2.Pid std.process2.spawnProcess(immutable(char)[], std.stdio.File, std.stdio.File, std.stdio.File, std.process2.Config)
 0x004020B9 in _Dmain

 Do I need a patched snn.lib or something else?
No, snn.lib included with the compilers for a few versions has been patched. The exception you would get would be different, that appears to be coming from std.process, even though the file name seems to be std.stdio. -Steve
Feb 24 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 24 February 2013 at 22:13:35 UTC, Steven Schveighoffer 
wrote:
 On Sun, 24 Feb 2013 16:04:43 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Sunday, 24 February 2013 at 17:41:44 UTC, Lars T. 
 Kyllingstad wrote:
 Ok, a new version with non-blocking wait is up.
3. The documentation for the "gui" config item seems to be wrong: it prevents the creation of a console, instead of causing it.
It means 'use gui mode' which means, don't create a console. I don't consider a console window a gui.
Sorry, I think you misunderstood. Currently, the documentation says: "On Windows, this option causes the process to run in a console window." However, when the flag is PRESENT, then the console window is SUPPRESSED (see line 522). The documentation's meaning is reversed.
 No, snn.lib included with the compilers for a few versions has 
 been patched.  The exception you would get would be different, 
 that appears to be coming from std.process, even though the 
 file name seems to be std.stdio.
OK... then I guess it doesn't work for me. Does it work for anyone else on Windows? Full code of my ls test program:

    import std.stdio;
    import std.process2;

    void main()
    {
        string[] files;
        auto p = pipe();
        auto pid = spawnProcess("ls", stdin, p.writeEnd);
        scope(exit) wait(pid);
        foreach (f; p.readEnd.byLine())
            files ~= f.idup;
    }
Feb 24 2013
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/25/13, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 Full code of my ls test program:
Works for me on XP. Using 'ls' from GnuWin32.
Feb 24 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 18:52:04 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Sunday, 24 February 2013 at 22:13:35 UTC, Steven Schveighoffer wrote:
 On Sun, 24 Feb 2013 16:04:43 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Sunday, 24 February 2013 at 17:41:44 UTC, Lars T. Kyllingstad wrote:
 Ok, a new version with non-blocking wait is up.
3. The documentation for the "gui" config item seems to be wrong: it prevents the creation of a console, instead of causing it.
It means 'use gui mode' which means, don't create a console. I don't consider a console window a gui.
Sorry, I think you misunderstood. Currently, the documentation says: "On Windows, this option causes the process to run in a console window." However, when the flag is PRESENT, then the console window is SUPPRESSED (see line 522). The documentation's meaning is reversed.
Yes, you are right. It needs to be fixed. Thanks.
 No, snn.lib included with the compilers for a few versions has been  
 patched.  The exception you would get would be different, that appears  
 to be coming from std.process, even though the file name seems to be  
 std.stdio.
OK... then I guess it doesn't work for me. Does it work for anyone else on Windows? Full code of my ls test program:

    import std.stdio;
    import std.process2;

    void main()
    {
        string[] files;
        auto p = pipe();
        auto pid = spawnProcess("ls", stdin, p.writeEnd);
        scope(exit) wait(pid);
        foreach (f; p.readEnd.byLine())
            files ~= f.idup;
    }
Hm... that message is printed out if the code cannot set the inherit handle flag on the specific stdin. Are you on windows 64 or 32? It's a large difference since one uses MSVCRT and one uses DMCRT. Also, I don't have windows 64, so I can't verify this if that's the case :) -Steve
Feb 24 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 00:02:54 UTC, Steven Schveighoffer 
wrote:
 Hm... that message is printed out if the code cannot set the 
 inherit handle flag on the specific stdin.

 Are you on windows 64 or 32?  It's a large difference since one 
 uses MSVCRT and one uses DMCRT.  Also, I don't have windows 64, 
 so I can't verify this if that's the case :)
I'm getting the same exception with DMD64 and DMD32.
Feb 24 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 19:17:44 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 00:02:54 UTC, Steven Schveighoffer wrote:
 Hm... that message is printed out if the code cannot set the inherit  
 handle flag on the specific stdin.

 Are you on windows 64 or 32?  It's a large difference since one uses  
 MSVCRT and one uses DMCRT.  Also, I don't have windows 64, so I can't  
 verify this if that's the case :)
I'm getting the same exception with DMD64 and DMD32.
Are you running from a console? If not, I think I see where the issue is. -Steve
Feb 24 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 00:27:10 UTC, Steven Schveighoffer 
wrote:
 On Sun, 24 Feb 2013 19:17:44 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 00:02:54 UTC, Steven 
 Schveighoffer wrote:
 Hm... that message is printed out if the code cannot set the 
 inherit handle flag on the specific stdin.

 Are you on windows 64 or 32?  It's a large difference since 
 one uses MSVCRT and one uses DMCRT.  Also, I don't have 
 windows 64, so I can't verify this if that's the case :)
I'm getting the same exception with DMD64 and DMD32.
Are you running from a console? If not, I think I see where the issue is.
I am running from a console, and I don't think this would make any difference. Maybe you intended to ask if I'm linking with /SUBSYSTEM:WINDOWS? (I'm not)
Feb 24 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 19:35:36 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 00:27:10 UTC, Steven Schveighoffer wrote:
 On Sun, 24 Feb 2013 19:17:44 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 00:02:54 UTC, Steven Schveighoffer  
 wrote:
 Hm... that message is printed out if the code cannot set the inherit  
 handle flag on the specific stdin.

 Are you on windows 64 or 32?  It's a large difference since one uses  
 MSVCRT and one uses DMCRT.  Also, I don't have windows 64, so I can't  
 verify this if that's the case :)
I'm getting the same exception with DMD64 and DMD32.
Are you running from a console? If not, I think I see where the issue is.
I am running from a console, and I don't think this would make any difference. Maybe you intended to ask if I'm linking with /SUBSYSTEM:WINDOWS? (I'm not)
So here is the code that is throwing that exception:

    static void prepareStream(ref File file, DWORD stdHandle, string which,
                              out int fileDescriptor, out HANDLE handle)
    {
        fileDescriptor = _fileno(file.getFP());
        if (fileDescriptor < 0)
            handle = GetStdHandle(stdHandle);
        else
        {
            version (DMC_RUNTIME) handle = _fdToHandle(fileDescriptor);
            else /* MSVCRT */     handle = _get_osfhandle(fileDescriptor);
        }
        if (!SetHandleInformation(handle, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT))
        {
            throw new StdioException(
                "Failed to pass "~which~" stream to child process", 0);
        }
    }

Called like this, since "stdin" is what is in the exception:

    prepareStream(stdin_, STD_INPUT_HANDLE, "stdin", stdinFD, startinfo.hStdInput);

It looks like you are passing stdin as the handle for stdin. From the above, the ways this exception could be thrown are:

1. The file descriptor from stdin failed to come out, and windows gives back a valid handle from GetStdHandle
2. The file descriptor is valid (0 or above), but _fdToHandle/_get_osfhandle fails to get a valid handle
3. We have a valid handle, but for some reason SetHandleInformation fails.

I'm guessing that since you are running with a normal subsystem, with a console, you have a valid handle. So my guess would be that SetHandleInformation is failing.

Can you catch the exception and print out GetLastError()?

-Steve
Feb 24 2013
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 00:44:43 UTC, Steven Schveighoffer 
wrote:
 Can you catch the exception and print out GetLastError()?
87 (ERROR_INVALID_PARAMETER): The parameter is incorrect. Consider using FileException, which calls sysErrorString.
Feb 24 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 00:44:43 UTC, Steven Schveighoffer 
wrote:
 1. The file descriptor from stdin failed to come out, and 
 windows gives back a valid handle from GetStdHandle
 2. The file descriptor is valid (0 or above), but 
 _fdToHandle/_get_osfhandle fails to get a valid handle
fileDescriptor is 0. The handle is 3. GetStdHandle(STD_INPUT_HANDLE) is also 3.
 3. We have a valid handle, but for some reason 
 SetHandleInformation fails.
Looks like it. Maybe you can't SetHandleInformation on standard handles in Windows 7?
Feb 24 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 19:57:41 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 00:44:43 UTC, Steven Schveighoffer wrote:
 1. The file descriptor from stdin failed to come out, and windows gives  
 back a valid handle from GetStdHandle
 2. The file descriptor is valid (0 or above), but  
 _fdToHandle/_get_osfhandle fails to get a valid handle
fileDescriptor is 0. The handle is 3. GetStdHandle(STD_INPUT_HANDLE) is also 3.
 3. We have a valid handle, but for some reason SetHandleInformation  
 fails.
Looks like it. Maybe you can't SetHandleInformation on standard handles in Windows 7?
I suppose that is possible. By default the normal stdin is used, so maybe the OS makes the decision on inheritance based on whether it is used or not. Clearly it works for some people... Did you try SetHandleInformation directly on that handle? It's still slightly possible that some other call caused the exception while this one was being thrown. -Steve
Feb 24 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 01:09:32 UTC, Steven Schveighoffer
wrote:
 Did you try SetHandleInformation directly on that handle?  It's 
 still slightly possible that some other call caused the 
 exception while this one was being thrown.
I put a writeln in the if, so I'm quite sure that it was SetHandleInformation that failed.
Feb 24 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 00:57:42 UTC, Vladimir Panteleev 
wrote:
 Looks like it. Maybe you can't SetHandleInformation on standard 
 handles in Windows 7?
GetHandleInformation reveals that HANDLE_FLAG_INHERIT is already set for the stdin handle for me.
Feb 24 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 01:10:08 UTC, Vladimir Panteleev 
wrote:
 On Monday, 25 February 2013 at 00:57:42 UTC, Vladimir Panteleev 
 wrote:
 Looks like it. Maybe you can't SetHandleInformation on 
 standard handles in Windows 7?
GetHandleInformation reveals that HANDLE_FLAG_INHERIT is already set for the stdin handle for me.
This fixes it for me: http://dump.thecybershadow.net/2418951ca5eea369fb9f84d0514aa5e3/aoeu.patch
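
The patch itself is not reproduced in this thread. One plausible shape for such a fix, given the GetHandleInformation finding above, is to query the flag first and call SetHandleInformation only when the handle is not already inheritable. The sketch below is an assumption along those lines, not the actual patch; `ensureInheritable` is a made-up name, and the code only does anything on Windows:

```d
version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical helper: returns true if `handle` is inheritable afterwards.
    bool ensureInheritable(HANDLE handle)
    {
        DWORD flags;
        // Query first: if HANDLE_FLAG_INHERIT is already set, there is
        // nothing to do and the call that failed above is never made.
        if (GetHandleInformation(handle, &flags) && (flags & HANDLE_FLAG_INHERIT))
            return true;
        return SetHandleInformation(handle, HANDLE_FLAG_INHERIT,
                                    HANDLE_FLAG_INHERIT) != 0;
    }
}

void main()
{
    version (Windows)
    {
        import std.stdio : writeln;
        writeln(ensureInheritable(GetStdHandle(STD_INPUT_HANDLE)));
    }
}
```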
Feb 24 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 24 Feb 2013 20:15:02 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 01:10:08 UTC, Vladimir Panteleev wrote:
 On Monday, 25 February 2013 at 00:57:42 UTC, Vladimir Panteleev wrote:
 Looks like it. Maybe you can't SetHandleInformation on standard  
 handles in Windows 7?
GetHandleInformation reveals that HANDLE_FLAG_INHERIT is already set for the stdin handle for me.
This fixes it for me: http://dump.thecybershadow.net/2418951ca5eea369fb9f84d0514aa5e3/aoeu.patch
OK, we'll make this change. -Steve
Feb 24 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 01:20:53 UTC, Steven Schveighoffer 
wrote:
 On Sun, 24 Feb 2013 20:15:02 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 01:10:08 UTC, Vladimir 
 Panteleev wrote:
 On Monday, 25 February 2013 at 00:57:42 UTC, Vladimir 
 Panteleev wrote:
 Looks like it. Maybe you can't SetHandleInformation on 
 standard handles in Windows 7?
GetHandleInformation reveals that HANDLE_FLAG_INHERIT is already set for the stdin handle for me.
This fixes it for me: http://dump.thecybershadow.net/2418951ca5eea369fb9f84d0514aa5e3/aoeu.patch
OK, we'll make this change.
Done. Thanks for tracking this down, guys! Vladimir, could you just verify that it works with the code I pushed just now? Lars
Feb 24 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 06:46:32 UTC, Lars T. Kyllingstad 
wrote:
 On Monday, 25 February 2013 at 01:20:53 UTC, Steven 
 Schveighoffer wrote:
 On Sun, 24 Feb 2013 20:15:02 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Monday, 25 February 2013 at 01:10:08 UTC, Vladimir 
 Panteleev wrote:
 On Monday, 25 February 2013 at 00:57:42 UTC, Vladimir 
 Panteleev wrote:
 Looks like it. Maybe you can't SetHandleInformation on 
 standard handles in Windows 7?
GetHandleInformation reveals that HANDLE_FLAG_INHERIT is already set for the stdin handle for me.
This fixes it for me: http://dump.thecybershadow.net/2418951ca5eea369fb9f84d0514aa5e3/aoeu.patch
OK, we'll make this change.
Done. Thanks for tracking this down, guys! Vladimir, could you just verify that it works with the code I pushed just now?
Yep, works now.
Feb 25 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 24 February 2013 at 21:04:45 UTC, Vladimir Panteleev 
wrote:
 On Sunday, 24 February 2013 at 17:41:44 UTC, Lars T. 
 Kyllingstad wrote:
 Ok, a new version with non-blocking wait is up.
1. Can the Firefox example be replaced with something else? Spawning a specific browser to open a webpage is bad practice, and I've noticed in several programs. It would be nice not to perpetuate it any further. There exists a proper way to open an URL in the default browser, already implemented as browse(url) in the current std.process.
Sure, I can think of another example. But I wouldn't read too much into this one; it was never meant as a demonstration of the "correct" way to open a web page. It was just a simple example of spawnProcess() usage that uses a cross-platform application everyone's heard of.

After all, you *could* argue this way about almost any kind of application which wasn't just invented for the sake of the example. (In the last one, shouldn't we open the user's preferred word processor, etc?)
 2. (Nitpick) The grep example uses a POSIX quoting syntax 
 (single quotes). Would be better to use double quotes, or pass 
 as array to avoid one more needless OS-specific element.
Actually, the quotes can just be removed altogether. I believe this is an old example, BTW, from the module's infancy, when it was POSIX-only. If anyone has a good idea for sample code which will be familiar to users of all platforms, please speak up.
 3. The documentation for the "gui" config item seems to be 
 wrong: it prevents the creation of a console, instead of 
 causing it.
I've fixed it now.
 4. I see that my command escaping functions have not made it 
 through. I believe the matter has been discussed before, and I 
 thought the consensus was to use them, although it's been a 
 while. The function escapeShellCommand and its callees from the 
 current std.process have the advantages that a) they come with 
 a very thorough unit test, whereas std.process2's Windows 
 escaping code does not have tests at all, and b) they are 
 usable independently, which allows constructing scripts and 
 batch files in D programs.
They were indeed supposed to be used in the new std.process, it just slipped my mind. I don't have time to fix it right now, but I'll do it at some point.

Personally, I don't think they should be part of the public API. They are inherently platform-specific, and we've tried to keep the module as platform-agnostic as possible.

Besides, they are not really usable with any of the other functions, and I am afraid it will be interpreted that way if we make them public.

How about we put them somewhere in the std.windows package? (std.windows.util, for example?)
 5. The spawnProcess versions which take a single string command 
 simply call std.string.split on the command. I believe this 
 approach to be fundamentally wrong, as passing an argument in 
 quotes will not work as expected. Furthermore, on Windows 
 (where process arguments are passed as a single string), 
 splitting the string in spawnProcess and then putting it back 
 together in spawnProcessWindows will result in a final command 
 line that is different from the one supplied by the user.
The whole point was to avoid any kind of arcane platform-specific quoting and escaping rules. If you have spaces inside your command line arguments, use the functions that take an array, and spawnProcess() will properly quote them. If you have any other funky characters in there, don't worry about it, spawnProcess() will properly escape them. The way it is now, the rules (if you can call it that) are exceedingly simple, and they are the same on all platforms. This has the added benefit of discouraging platform-dependent client code. Lars
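
To make the two forms concrete, here is a small sketch of what a plain whitespace split does to a command line containing a quoted path, next to the array form. `naiveSplit` is a hypothetical stand-in that assumes the single-string overload does nothing more than split on whitespace, and the command line is made up:

```d
import std.array : split;
import std.stdio : writeln;

// Hypothetical stand-in for the single-string overload's argument handling,
// assuming it is a plain whitespace split with no quote processing.
string[] naiveSplit(string command)
{
    return command.split();
}

void main()
{
    // A made-up command line whose program path contains a space.
    string command = `"C:\Program Files\Tool\tool.exe" --verbose input.txt`;

    // The quotes do not protect the space: the path is cut in two and the
    // quote characters stay attached to the fragments.
    writeln(naiveSplit(command));

    // The array form has one element per argument, so no quoting is needed.
    string[] args = [`C:\Program Files\Tool\tool.exe`, "--verbose", "input.txt"];
    writeln(args);
}
```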
Feb 24 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 06:41:32 UTC, Lars T. Kyllingstad 
wrote:
 Sure, I can think of another example.  But I wouldn't read too 
 much into this one; it was never meant as a demonstration of 
 the "correct" way to open a web page.   It was just a simple 
 example of spawnProcess() usage that uses a cross-platform 
 application everyone's heard of.

 After all, you *could* argue this way about almost any kind of 
 application which wasn't just invented for the sake of the 
 example.  (In the last one, shouldn't we open the user's 
 preferred word processor, etc?)
The question is, what is the intent? Is it to just open some URL, or to specifically start Firefox? The same applies to the word processor case: if the document is in a file format understood by several applications, is the intent to simply open the document, or to open the document in that specific application?

Now, the documentation clearly says that the example specifically launches Firefox. However, that doesn't mean that someone won't reach for that example when hastily putting together an application that needs to open a URL. After all, it's at the top of the file, and they may not even know about the existence of the browse function which actually does what they intend.

How about using "lynx -dump http://dlang.org/"? Dumping a text representation of a webpage is a feature specific to lynx, so the intent is clearer.
 2. (Nitpick) The grep example uses a POSIX quoting syntax 
 (single quotes). Would be better to use double quotes, or pass 
 as array to avoid one more needless OS-specific element.
Actually, the quotes can just be removed altogether.
OK, and now it's worse: your example uses syntax that's specific to std.process2. If you type that command in the shell, you'll get different behavior (the backslash will escape the . as a shell escape, not a RE escape).
 If anyone has a good idea for sample code which will be 
 familiar to users of all platforms, please speak up.
If we restrict ourselves to programs that would already work for all users, there's not much to pick from: the standard Windows and POSIX command-line utilities barely overlap, although there's also the programs included with D. Maybe include dmd, rdmd or dman in some examples?
 Personally, I don't think they should be part of the public 
 API.  They are inherently platform-specific, and we've tried to 
 keep the module as platform-agnostic as possible.
Constructing scripts is bound to be platform-specific. The current module version allows constructing batch files on POSIX. Here's a practical use case example for this feature: DMD uses the same syntax for response files on all platforms, and it follows the Windows command-line parsing rules. Currently, rdmd uses escapeWindowsArgument to build that response file on all platforms.
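
A minimal sketch of that use case, using escapeWindowsArgument from the current std.process on any platform. `buildResponseFile` and the file names are invented for the example, and this is not rdmd's actual code:

```d
import std.algorithm.iteration : map;
import std.array : join;
import std.process : escapeWindowsArgument;
import std.stdio : writeln;

// Hypothetical helper: escape every argument with the Windows rules and
// join the results into the text of a response file.
string buildResponseFile(const(string)[] args)
{
    return args.map!(a => escapeWindowsArgument(a)).join(" ");
}

void main()
{
    // Arguments with spaces are the interesting case.
    auto text = buildResponseFile(["-ofmy app", "source files/main.d", "-O"]);
    // In a real tool this would be written to a file and passed as @file.
    writeln(text);
}
```

Because the escaping function is public and platform-independent, the same code builds a valid response file on POSIX and on Windows.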
 Besides, they are not really usable with any of the other 
 functions, and I am afraid it will be interpreted that way if 
 we make them public.
This is actually a design problem in the new module, which I haven't discussed yet. Have a look at the very last example in the current std.process docs. How do you accomplish that correctly in the new version, without manually piping the inputs yourself? You can't.
 5. The spawnProcess versions which take a single string 
 command simply call std.string.split on the command. I believe 
 this approach to be fundamentally wrong, as passing an 
 argument in quotes will not work as expected. Furthermore, on 
 Windows (where process arguments are passed as a single 
 string), splitting the string in spawnProcess and then putting 
 it back together in spawnProcessWindows will result in a final 
 command line that is different from the one supplied by the 
 user.
The whole point was to avoid any kind of arcane platform-specific quoting and escaping rules. If you have spaces inside your command line arguments, use the functions that take an array, and spawnProcess() will properly quote them. If you have any other funky characters in there, don't worry about it, spawnProcess() will properly escape them. The way it is now, the rules (if you can call them that) are exceedingly simple, and they are the same on all platforms. This has the added benefit of discouraging platform-dependent client code.
OK, then picture the following situation. A user of the new module invokes a specific command using the spawnProcess overload that takes it as a single string. Convenient, right? Then, as the program evolves, the string becomes an enum, then a config variable, which the user can adjust. Then, an end-user tries setting the config variable to a path that contains spaces, and everything breaks. Wrapping the path in quotes does not help either. Due to the way the function is designed, it is IMPOSSIBLE for the end-user to configure the application to launch a program located at a path containing spaces. To end-users, this comes off as a classic problem in badly written applications that don't handle command-line escaping properly.

This is the same problem as with any interface that works in simple cases but behaves unexpectedly in more complicated ones: it is bad design (convenience or not), and must be avoided.

I suggest that either the overloads which take a single string be removed, or that they spawn a shell instead, and let the shell do the command-line splitting. Together with my command and filename escaping functions, they should allow the user to achieve any combination of executing commands with arbitrary punctuation in the program path or arguments, as well as redirecting the output to files (again, with correctly-escaped filenames) or other programs using the existing shell syntax present on both platforms.
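To make the failure mode above concrete, here is a minimal sketch. It assumes the single-string overload does nothing more than call std.array.split on whitespace, as described earlier in this post; the helper name naiveSplit and the example path are made up for illustration.

```d
import std.array : split;

/// Hypothetical stand-in for what the single-string overload is described
/// as doing internally: split on whitespace, nothing more.
string[] naiveSplit(string command)
{
    return split(command);
}

unittest
{
    // An end-user quotes the path, hoping that protects the space.
    auto argv = naiveSplit(`"C:/Program Files/xyz/xyz.exe" --verbose`);

    // The quotes are just characters: the path is cut in two anyway.
    assert(argv.length == 3);
    assert(argv[0] == `"C:/Program`);
    assert(argv[1] == `Files/xyz/xyz.exe"`);

    // An explicit array keeps the program path intact, spaces and all.
    string[] asArray = ["C:/Program Files/xyz/xyz.exe", "--verbose"];
    assert(asArray[0] == "C:/Program Files/xyz/xyz.exe");
}
```

The array form needs no quoting convention at all, which is why it is the only form that an end-user cannot break from a config file.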
Feb 25 2013
next sibling parent "Jakob Ovrum" <jakobovrum gmail.com> writes:
On Monday, 25 February 2013 at 15:07:47 UTC, Vladimir Panteleev 
wrote:
 I suggest that either the overloads which take a single string 
 be removed, or that they spawn a shell instead, and let the 
 shell do the command-line splitting. Together with my command 
 and filename escaping functions, they should allow the user to 
 achieve any combination of executing commands with arbitrary 
 punctuation in the program path or arguments, as well as 
 redirecting the output to files (again, with correctly-escaped 
 filenames) or other programs using the existing shell syntax 
 present on both platforms.
I concur that they should be removed. If the user wants the behaviour of split(), the user can use split() explicitly and the serious implications of that will be out in the open rather than buried in standard library source code and documentation.
Feb 25 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 15:07:47 UTC, Vladimir Panteleev 
wrote:
 On Monday, 25 February 2013 at 06:41:32 UTC, Lars T. 
 Kyllingstad wrote:
 Sure, I can think of another example.  But I wouldn't read too 
 much into this one; it was never meant as a demonstration of 
 the "correct" way to open a web page.   It was just a simple 
 example of spawnProcess() usage that uses a cross-platform 
 application everyone's heard of.

 After all, you *could* argue this way about almost any kind of 
 application which wasn't just invented for the sake of the 
 example.  (In the last one, shouldn't we open the user's 
 preferred word processor, etc?)
The question is, what is the intent? Is it to just open some URL, or to specifically start Firefox? The same applies to the word processor case: if the document is in a file format understood by several applications, is the intent to simply open the document, or to open it in that specific application? Now, the documentation clearly says that the example specifically launches Firefox. However, that doesn't mean that someone won't reach for that example when hastily putting together an application that needs to open a URL. After all, it's at the top of the file, and they may not even know about the existence of the browse function, which actually does what they intend. How about using "lynx -dump http://dlang.org/"? Dumping a text representation of a webpage is a feature specific to lynx, so the intent is clearer.
That is also incredibly obscure. I'd venture a guess that only ~10% of D's user base have even heard of Lynx. Everyone knows firefox, and will understand what the example is supposed to illustrate. (I admit that the ls/grep examples will also be rather incomprehensible to someone not familiar with the *NIX command line, and I will replace them with something else. The D toolchain, as you suggest below, is a very good idea.) BTW, browse() should never have been added to std.process, in my opinion. Maybe to some other utility module, but then it should at least be done right, and be properly documented. What does it actually do? There is no way to tell unless you read the source. (And then, it turns out that it spawns a new process for the browser and returns immediately, but it does not return a process ID that you can poll or wait for. Bah.)
 2. (Nitpick) The grep example uses a POSIX quoting syntax 
 (single quotes). Would be better to use double quotes, or 
 pass as array to avoid one more needless OS-specific element.
Actually, the quotes can just be removed altogether.
OK, and now it's worse: your example uses syntax that's specific to std.process2. If you type that command in the shell, you'll get different behavior (the backslash will escape the . as a shell escape, not a RE escape).
[I am going to let slip here that you almost have me convinced with many of your arguments below, but I am still going to play devil's advocate for a bit.] It is not worse. It is a lot simpler, because the programmer does not need to know anything about the underlying platform. They only need to know one rule: If your arguments contain spaces, use the array functions. I don't think the generic process-spawning functions in std.process should be bound by, or tied to, the syntax of whatever shell the programmer (or the end user) prefers. [...]
 Personally, I don't think they should be part of the public 
 API.  They are inherently platform-specific, and we've tried 
 to keep the module as platform-agnostic as possible.
Constructing scripts is bound to be platform-specific. The current module version allows constructing batch files on POSIX. Here's a practical use case example for this feature: DMD uses the same syntax for response files on all platforms, and it follows the Windows command-line parsing rules. Currently, rdmd uses escapeWindowsArgument to build that response file on all platforms.
Point taken.
 Besides, they are not really usable with any of the other 
 functions, and I am afraid it will be interpreted that way if 
 we make them public.
This is actually a design problem in the new module, which I haven't discussed yet. Have a look at the very last example in the current std.process docs. How do you accomplish that correctly in the new version, without manually piping the inputs yourself? You can't.
I grudgingly admit that this is true.
 [...]

 The way it is now, the rules (if you can call them that) are 
 exceedingly simple, and they are the same on all platforms.  
 This has the added benefit of discouraging platform-dependent 
 client code.
OK, then picture the following situation. A user of the new module invokes a specific command using the spawnProcess overload that takes it as a single string. Convenient, right? Then, as the program evolves, the string becomes an enum, then a config variable, which the user can adjust. Then, an end-user tries setting the config variable to a path that contains spaces, and everything breaks. Wrapping the path in quotes does not help either. Due to the way the function is designed, it is IMPOSSIBLE for the end-user to configure the application to launch a program located at a path containing spaces. To end-users, this comes off as a classic problem in badly written applications that don't handle command-line escaping properly.
Exposing the specifics of whatever programming language you are using to the end user? I would just call that bad application programming. If anything, you should be using one of the 'shell' functions in this case, not spawnProcess.
 This problem is as with any case of an interface which works in 
 simple cases, but behaves unexpectedly in more complicated 
 cases: it is bad design (convenience or not), and must be 
 avoided.

 I suggest that either the overloads which take a single string 
 be removed, or that they spawn a shell instead, and let the 
 shell do the command-line splitting. Together with my command 
 and filename escaping functions, they should allow the user to 
 achieve any combination of executing commands with arbitrary 
 punctuation in the program path or arguments, as well as 
 redirecting the output to files (again, with correctly-escaped 
 filenames) or other programs using the existing shell syntax 
 present on both platforms.
You almost have me convinced that the single-string non-shell functions must go. In the case of pipeProcess() and execute(), pipeShell() and shell() do the same job, with (arguably, still) fewer surprises. Maybe it would then be a good idea to add a spawnShell() function to go with spawnProcess().

The escape*() functions need to be better documented. Am I correct that they quote according to 'cmd.exe' rules on Windows and 'sh' rules on POSIX?

Lars
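For reference, a spawnShell() along these lines does exist in present-day Phobos. The sketch below assumes that API (spawnShell and wait from std.process), not the std.process2 draft under review; the wrapper name runInShell is invented for this example.

```d
import std.process : spawnShell, wait;

/// Runs `command` through the platform shell and returns its exit status.
/// (runInShell is a name made up for this sketch.)
int runInShell(string command)
{
    auto pid = spawnShell(command); // the shell, not D, parses the string
    return wait(pid);               // block until the shell exits
}
```

Because the string goes to the shell verbatim, quoting follows the rules users already know from their command prompt, rather than a third set of rules defined by the library.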
Feb 25 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Monday, 25 February 2013 at 20:06:19 UTC, Lars T. Kyllingstad 
wrote:
 That is also incredibly obscure.  I'd venture a guess that only 
 ~10% of D's user base have even heard of Lynx.  Everyone knows 
 firefox, and will understand what the example is supposed to 
 illustrate.  (I admit that the ls/grep examples will also be 
 rather incomprehensible to someone not familiar with the *NIX 
 command line, and I will replace them with something else.  The 
 D toolchain, as you suggest below, is a very good idea.)
I still think using Firefox is a bad idea, but I've already presented my arguments.
 BTW, browse() should never have been added to std.process, in 
 my opinion.  Maybe to some other utility module, but then it 
 should at least be done right, and be properly documented.
What would you improve about it? I have no opinion on its location in Phobos, but std.process is the most fitting one if you don't consider creating a new module.
 What does it actually do?  There is no way to tell unless you 
 read the source.
I don't see why the documentation needs to be burdened with implementation details, other than perhaps mentioning that it returns immediately. The implementation is rather OS-specific... if we find out that there is a better way of accomplishing the task on a given platform, the documentation would need to be updated. Isn't the documentation considered part of the contract with the library user?
  (And then, it turns out that it spawns a new process for the 
 browser and returns immediately, but it does not return a 
 process ID that you can poll or wait for.  Bah.)
That's exactly how it should work... due to technical reasons. Most browsers will communicate the need to open a web page in a new tab to an existing instance and exit immediately, if there is an existing instance. There is no practical way to wait for the user to close the web browser.
 It is not worse.  It is a lot simpler, because the programmer 
 does not need to know anything about the underlying platform.  
 They only need to know one rule:  If your arguments contain 
 spaces, use the array functions.  I don't think the generic 
 process-spawning functions in std.process should be bound by, 
 or tied to, the syntax of whatever shell the programmer (or the 
 end user) prefers.
Yes, I agree that tying it to a certain shell syntax is bad. However, introducing a third syntax incompatible with either of the two major ones, which additionally is less expressive, is IMHO worse. The "universal" way to distinguish / quote arguments is to use an array.
 If your arguments contain spaces, use the array functions.
Not just arguments: programs as well. On Windows, all third-party software is expected to install itself under the "Program Files" directory. Unless whatever you're launching is expected to be in PATH, splitting the string by spaces won't get you far on Windows.
 Then, an end-user tries setting the config variable to a path 
 that contains spaces, and everything breaks. Wrapping the path 
 in quotes does not help either. Due to the way the function is 
 designed, it is IMPOSSIBLE for the end-user to configure the 
 application to launch a program located at a path containing 
 spaces. To end-users, this comes off as a classical problem in 
 badly written applications that don't handle command-line 
 escaping properly.
Exposing the specifics of whatever programming language you are using to the end user?
I don't understand what you mean here. It's not exposing any specifics if you don't implement anything in a way specific to D.
 I would just call that bad application programming.  If 
 anything, you should be using one of the 'shell' functions in 
 this case, not spawnProcess.
Yes, if the user is expected to customize the arguments as well. Otherwise it could very well be the spawnProcess overload that takes an array of arguments.
 You almost have me convinced that the single-string non-shell 
 functions must go.  In the case of pipeProcess() and execute(), 
 pipeShell() and shell() do the same job, with (arguably, still) 
 fewer surprises.  Maybe it would then be a good idea to add a 
 spawnShell() function to go with spawnProcess().
How about something that converts a ProcessPipes instance into a Tuple!(int, "status", string, "output") as returned by shell? Then you could do e.g.

auto result = shell("command").collectOutput();
// use result.status or result.output
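As a rough illustration of the status/output pairing: present-day Phobos offers executeShell, which returns such a tuple directly. The sketch below assumes that API rather than the collectOutput() idea floated here, and the helper name shellOutput is hypothetical.

```d
import std.exception : enforce;
import std.process : executeShell;
import std.string : strip;

/// Hypothetical helper: run a shell command, fail on a non-zero status,
/// and return the captured output with surrounding whitespace removed.
string shellOutput(string command)
{
    auto result = executeShell(command); // has .status and .output
    enforce(result.status == 0, "command failed: " ~ command);
    return result.output.strip;
}
```

Stripping the output hides the difference between "\n" and "\r\n" line endings, which keeps the helper usable on both POSIX and Windows.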
 The escape*() functions need to be better documented.  Am I 
 correct that they quote according to 'cmd.exe' rules on Windows 
 and 'sh' rules on POSIX?
Yes. On Windows, though, the rules are defined by CommandLineToArgvW for splitting/escaping the individual arguments, and cmd.exe for the whole string when it's passed to the shell. Check the reference URLs in the comments in escapeWindowsArgumentImpl.
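To illustrate the response-file use case mentioned earlier, here is a minimal sketch. It assumes escapeWindowsArgument is public in std.process on every platform (as it is in present-day Phobos); the helper name responseFileLine is made up for the example.

```d
import std.algorithm : map;
import std.array : join;
import std.process : escapeWindowsArgument;

/// Hypothetical helper: build the contents of a DMD-style response file,
/// escaping each argument by the Windows (CommandLineToArgvW) rules
/// regardless of the platform this code runs on.
string responseFileLine(string[] args)
{
    return args.map!(a => escapeWindowsArgument(a)).join(" ");
}
```

Each argument is escaped individually and only then joined, so a space inside an argument can never be confused with the separator between arguments.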
Feb 25 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 21:06:54 UTC, Vladimir Panteleev 
wrote:
 On Monday, 25 February 2013 at 20:06:19 UTC, Lars T. 
 Kyllingstad wrote:
 That is also incredibly obscure.  I'd venture a guess that 
 only ~10% of D's user base have even heard of Lynx.  Everyone 
 knows firefox, and will understand what the example is 
 supposed to illustrate.  (I admit that the ls/grep examples 
 will also be rather incomprehensible to someone not familiar 
 with the *NIX command line, and I will replace them with 
 something else.  The D toolchain, as you suggest below, is a 
 very good idea.)
I still think using Firefox is a bad idea, but I've already presented my arguments.
 BTW, browse() should never have been added to std.process, in 
 my opinion.  Maybe to some other utility module, but then it 
 should at least be done right, and be properly documented.
What would you improve about it?
1. I would document it properly.
2. As long as it runs in the background, I would return some kind of process ID from it. (Yes, most browsers today may just signal another instance to open a new tab and then return, but I would be surprised if they *all* do.)
(3. Maybe put it in a different module, I'm not sure.)

Also, and this is of course extremely subjective, it just looks out of place and very much "alone". Where is writeEmailInDefaultClient(address)? Where is openInAssociatedApp(file)?
 I have no opinion on its location in Phobos, but std.process is 
 the most fitting one if you don't consider creating a new 
 module.
Maybe. [...]
 Then, an end-user tries setting the config variable to a path 
 that contains spaces, and everything breaks. Wrapping the 
 path in quotes does not help either. Due to the way the 
 function is designed, it is IMPOSSIBLE for the end-user to 
 configure the application to launch a program located at a 
 path containing spaces. To end-users, this comes off as a 
 classical problem in badly written applications that don't 
 handle command-line escaping properly.
Exposing the specifics of whatever programming language you are using to the end user?
I don't understand what you mean here. It's not exposing any specifics if you don't implement anything in a way specific to D.
You pass a string directly from a config file into a rather low-level function in the programming language's standard library without any kind of validation. [...]
 You almost have me convinced that the single-string non-shell 
 functions must go.  In the case of pipeProcess() and 
 execute(), pipeShell() and shell() do the same job, with 
 (arguably, still) fewer surprises.  Maybe it would then be a 
 good idea to add a spawnShell() function to go with 
 spawnProcess().
How about something that converts a ProcessPipes instance into a Tuple!(int, "status", string, "output") as returned by shell? Then you could do e.g.

auto result = shell("command").collectOutput();
// use result.status or result.output
That's pretty elegant. :) Lars
Feb 25 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 07:17:49 UTC, Lars T. Kyllingstad 
wrote:
 On Monday, 25 February 2013 at 21:06:54 UTC, Vladimir Panteleev 
 wrote:
 On Monday, 25 February 2013 at 20:06:19 UTC, Lars T. 
 Kyllingstad wrote:
 That is also incredibly obscure.  I'd venture a guess that 
 only ~10% of D's user base have even heard of Lynx.  Everyone 
 knows firefox, and will understand what the example is 
 supposed to illustrate.  (I admit that the ls/grep examples 
 will also be rather incomprehensible to someone not familiar 
 with the *NIX command line, and I will replace them with 
 something else.  The D toolchain, as you suggest below, is a 
 very good idea.)
I still think using Firefox is a bad idea, but I've already presented my arguments.
 BTW, browse() should never have been added to std.process, in 
 my opinion.  Maybe to some other utility module, but then it 
 should at least be done right, and be properly documented.
What would you improve about it?
1. I would document it properly.
2. As long as it runs in the background, I would return some kind of process ID from it. (Yes, most browsers today may just signal another instance to open a new tab and then return, but I would be surprised if they *all* do.)
(3. Maybe put it in a different module, I'm not sure.)
4. I would design it so that if I do browse("foo.txt") it opens foo.txt in the web browser. Correct me if I'm wrong, but it currently seems that it will open it in the user's text editor on Windows. (On POSIX systems, too, if $BROWSER isn't set.)
1a. I would document that it uses $BROWSER on POSIX, as that is not even remotely standard. (It is not set on my Ubuntu 11.10 machine, for instance.) I would document that it uses xdg-open on Linux if $BROWSER is not set, as xdg-open will only exist on systems that conform to FreeDesktop standards.

Even if it turns out that browse() is the "best" way to implement this feature, it is not perfect by a long shot, and as such its shortcomings should be well documented.

Lars
Feb 25 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 07:17:49 UTC, Lars T. Kyllingstad 
wrote:
 What would you improve about it?
1. I would document it properly.
2. As long as it runs in the background, I would return some kind of process ID from it. (Yes, most browsers today may just signal another instance to open a new tab and then return, but I would be surprised if they *all* do.)
Pretty sure all major Windows browsers do. The result is useless anyway. The user can browse away to another website, and if the browser is tabbed, open a completely different website and close the tab containing your website. What the user does from that moment is not the program's business. If it's important to detect when your website/webapp's tab is closed, I'd suggest implementing a long-polling request (COMET) and acting on when the connection is interrupted and there is no reconnection in 5 seconds.
 (3. Maybe put it in a different module, I'm not sure.)

 Also, and this is of course extremely subjective, it just looks 
 out of place and very much "alone".  Where is 
 writeEmailInDefaultClient(address)?  Where is 
 openInAssociatedApp(file)?
I guess simply no one has written them yet? writeEmailInDefaultClient(address) is accomplished by opening the "mailto:"~address URL. The matter of protocols other than http needs some attention, though...
 4. I would design it so that if I do browse("foo.txt") it opens 
 foo.txt in the web browser.  Correct me if I'm wrong, but it 
 currently seems that it will open it in the user's text editor 
 on Windows.  (On POSIX systems, too, if $BROWSER isn't set.)
I don't know how you would accomplish that on Windows, without accessing the association in the OS registry for e.g. the http protocol. Might be better to change the documentation instead.
 Exposing the specifics of whatever programming language you 
 are using to the end user?
I don't understand what you mean here. It's not exposing any specifics if you don't implement anything in a way specific to D.
You pass a string directly from a config file into a rather low-level function in the programming language's standard library without any kind of validation.
Running a program from another program should be that simple. I don't think anyone expects to validate a command-line.
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 07:20:51 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 07:17:49 UTC, Lars T. Kyllingstad wrote:
 4. I would design it so that if I do browse("foo.txt") it opens foo.txt  
 in the web browser.  Correct me if I'm wrong, but it currently seems  
 that it will open it in the user's text editor on Windows.  (On POSIX  
 systems, too, if $BROWSER isn't set.)
I don't know how you would accomplish that on Windows, without accessing the association in the OS registry for e.g. the http protocol. Might be better to change the documentation instead.
shell("start foo.txt");

At least, I think this would work ;)

-Steve
Feb 26 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 14:08:34 UTC, Steven 
Schveighoffer wrote:
 On Tue, 26 Feb 2013 07:20:51 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 07:17:49 UTC, Lars T. 
 Kyllingstad wrote:
 4. I would design it so that if I do browse("foo.txt") it 
 opens foo.txt in the web browser.  Correct me if I'm wrong, 
 but it currently seems that it will open it in the user's 
 text editor on Windows.  (On POSIX systems, too, if $BROWSER 
 isn't set.)
I don't know how you would accomplish that on Windows, without accessing the association in the OS registry for e.g. the http protocol. Might be better to change the documentation instead.
shell("start foo.txt"); At least, I think this would work ;)
No, start uses the same function, ShellExecute. It will open whatever is associated with .txt files, a text editor probably.
Feb 26 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 09:13:01 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 14:08:34 UTC, Steven Schveighoffer wrote:
 On Tue, 26 Feb 2013 07:20:51 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 07:17:49 UTC, Lars T. Kyllingstad  
 wrote:
 4. I would design it so that if I do browse("foo.txt") it opens  
 foo.txt in the web browser.  Correct me if I'm wrong, but it  
 currently seems that it will open it in the user's text editor on  
 Windows.  (On POSIX systems, too, if $BROWSER isn't set.)
I don't know how you would accomplish that on Windows, without accessing the association in the OS registry for e.g. the http protocol. Might be better to change the documentation instead.
shell("start foo.txt"); At least, I think this would work ;)
No, start uses the same function, ShellExecute. It will open whatever is associated with .txt files, a text editor probably.
Oh, I thought that was the desired behavior, I misread the above post... -Steve
Feb 26 2013
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 2/25/13, Lars T. Kyllingstad <public kyllingen.net> wrote:
 Personally, I don't think they should be part of the public API.
They're extremely useful, especially when you have to deal with Optlink or other software on win32.
 How about we put them somewhere in the std.windows package?
 (std.windows.util, for example?)
You can, but they should be public and available on all systems (that means no version(Windows) shenanigans). E.g. someone might want to build a cross-platform build tool and invoke the compiler/linker manually (even if it's just an app sending commands remotely but running on a POSIX system); in that case these escaping functions are very useful to have.
 If you have spaces inside your
 command line arguments, use the functions that take an array, and
 spawnProcess() will properly quote them.
What if we get a command from somewhere else as a single string and don't know if it has spaces?
Feb 25 2013
prev sibling next sibling parent reply "nazriel" <spam dzfl.pl> writes:
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use over 
 those years, both by yours truly and others, so hopefully the 
 worst wrinkles are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  
 (The wiki page indicates that both std.benchmark and std.uni 
 are currently being reviewed, but I fail to find any "official" 
 review threads on the forum.  Is the wiki just out of date?)

 Lars
Very nice! Good job, folks.

I have a question; sorry if it was asked before. Is there any way to call some functions after fork but before execve? Some kind of callback approach. It would be needed to implement some kind of resource limiting in the subprocess (with setrlimit), or dropping root privileges in the subprocess.
Feb 25 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 February 2013 at 22:32:44 UTC, nazriel wrote:
 Very nice! Good job folks.
Thanks!
 I have a question; sorry if it was asked before.

 Is there any way to call some functions after fork but before 
 execve? Some kind of callback approach. It would be needed to 
 implement some kind of resource limiting in the subprocess (with 
 setrlimit), or dropping root privileges in the subprocess.
Sorry, no. I don't think we can do that, as it would be very *NIX specific. On Windows there are no separate fork() and exec() calls, just one CreateProcess() call. Lars
Feb 25 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 23 Feb 2013 06:31:19 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 It's been years in the coming, but we finally got it done. :)  The  
 upshot is that the module has actually seen active use over those years,  
 both by yours truly and others, so hopefully the worst wrinkles are  
 already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  (The wiki  
 page indicates that both std.benchmark and std.uni are currently being  
 reviewed, but I fail to find any "official" review threads on the  
 forum.  Is the wiki just out of date?)
I just reread the docs, considering Vladimir's point about space-containing no-arg programs. I agree there is a problem.

We should not get rid of the single-program version of spawn; we should simply interpret it as a no-arg program.

To have this not work:

spawnProcess("c:/Program Files/xyz/xyz.exe");

and require this instead:

spawnProcess("c:/Program Files/xyz/xyz.exe", []);

is not very intuitive.

It reminds me of when we had writefln and not writeln: in order to print out a string with % in it, you had to do writefln("%s", "%s");

Now, I think we have an additional issue in that it's difficult to take a string argument with parameters in it, and pass it in one line:

string executeThis = "prog arg1 arg2";
auto params = split(executeThis);
spawnProcess(params[0], params[1..$]);

It would be nice to just be able to do this:

spawnProcess(split(executeThis));

I think we need an overload for that, especially if we get rid of the auto-splitting of commands. It should assert if the array is empty.

-Steve
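The two rules proposed above (a lone string is a program with no arguments; a single array is program plus arguments) can be sketched without touching the real spawnProcess. The Command struct and toCommand functions below are hypothetical stand-ins invented for this illustration, not part of std.process or std.process2.

```d
import std.array : split;

/// Hypothetical result type: what would eventually be handed to the OS.
struct Command
{
    string program;
    string[] args;
}

/// Single string: never split, the whole string names the program.
Command toCommand(string program)
{
    return Command(program, null);
}

/// Single array: first element is the program, the rest are its arguments.
Command toCommand(string[] argv)
{
    assert(argv.length > 0, "argv must contain at least the program name");
    return Command(argv[0], argv[1 .. $]);
}

unittest
{
    // A path with spaces survives because nothing splits it.
    auto a = toCommand("c:/Program Files/xyz/xyz.exe");
    assert(a.program == "c:/Program Files/xyz/xyz.exe");
    assert(a.args.length == 0);

    // Whitespace splitting is still available, but the caller asks for it.
    auto b = toCommand(split("prog arg1 arg2"));
    assert(b.program == "prog");
    assert(b.args == ["arg1", "arg2"]);
}
```

With this shape the dangerous operation (splitting on whitespace) is visible at the call site instead of being hidden inside the library.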
Feb 26 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-02-26 15:22, Steven Schveighoffer wrote:

 It would be nice to just be able to do this:

 spawnProcess(split(executeThis));

 I think we need an overload for that, especially if we get rid of the
 auto-splitting of commands.  It should assert if the array is empty.
How about: spawnProcess(string[] args ...); -- /Jacob Carlborg
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 09:36:24 -0500, Jacob Carlborg <doob me.com> wrote:

 On 2013-02-26 15:22, Steven Schveighoffer wrote:

 It would be nice to just be able to do this:

 spawnProcess(split(executeThis));

 I think we need an overload for that, especially if we get rid of the
 auto-splitting of commands.  It should assert if the array is empty.
How about: spawnProcess(string[] args ...);
Except there are non-string arguments at the end, stdin/stdout/stderr/config -Steve
Feb 26 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-02-26 15:50, Steven Schveighoffer wrote:

 Except there are non-string arguments at the end,
 stdin/stdout/stderr/config
Wow, that was a couple of extra parameters. Didn't actually look before. -- /Jacob Carlborg
Feb 26 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Feb 26, 2013 at 09:22:11AM -0500, Steven Schveighoffer wrote:
[...]
 I just reread the docs, considering Vladimir's point about
 space-containing no-arg programs.  I agree there is a problem.
 
 We need to not get rid of the single program version of spawn, we
 need to simply interpret it as a no-arg program.
 
 To have this not work:
 
 spawnProcess("c:/Program Files/xyz/xyz.exe");
 
 and require this instead:
 
 spawnProcess("c:/Program Files/xyz/xyz.exe", []);
 
 is not very intuitive.
 
 It reminds me of when we had writefln and not writeln, in order to
 print out a string with % in it, you had to do writefln("%s", "%s");
[...]

I agree. I think the onus should be on the user to call std.array.split (or equivalent) if he wants to have arguments split on whitespace. Like this:

spawnProcess(split("dmd -O myprogram.d mymodule.d"));

Not much of a difference in usability, but prevents nasty gotchas like cited above.

I vote for spawnProcess to never automatically split on whitespace.


T

-- 
Real Programmers use "cat > a.out".
Feb 26 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 14:22:08 UTC, Steven 
Schveighoffer wrote:
 On Sat, 23 Feb 2013 06:31:19 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use 
 over those years, both by yours truly and others, so hopefully 
 the worst wrinkles are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  
 (The wiki page indicates that both std.benchmark and std.uni 
 are currently being reviewed, but I fail to find any 
 "official" review threads on the forum.  Is the wiki just out 
 of date?)
I just reread the docs, considering Vladimir's point about space-containing no-arg programs. I agree there is a problem. We need to not get rid of the single program version of spawn, we need to simply interpret it as a no-arg program. To have this not work: spawnProcess("c:/Program Files/xyz/xyz.exe"); and require this instead: spawnProcess("c:/Program Files/xyz/xyz.exe", []); is not very intuitive. It reminds me of when we had writefln and not writeln, in order to print out a string with % in it, you had to do writefln("%s", "%s"); Now, I think we have an additional issue in that it's difficult to take a string argument with parameters in it, and pass it in one line: string executeThis = "prog arg1 arg2"; auto params = split(executeThis); spawnProcess(params[0], params[1..$]); It would be nice to just be able to do this: spawnProcess(split(executeThis)); I think we need an overload for that, especially if we get rid of the auto-splitting of commands. It should assert if the array is empty.
I propose we only have two versions:

spawnProcess(string[] args, File stdin, etc...)
spawnProcess(string[] args, string[string] env, File stdin, etc...)

You'd use it like this:

spawnProcess(["prog"]);
spawnProcess(["prog", "arg1", "arg2"])

etc.  Then it would also work with split().

Lars
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Tuesday, 26 February 2013 at 14:22:08 UTC, Steven Schveighoffer wrote:
 On Sat, 23 Feb 2013 06:31:19 -0500, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:

 It's been years in the coming, but we finally got it done. :)  The  
 upshot is that the module has actually seen active use over those  
 years, both by yours truly and others, so hopefully the worst wrinkles  
 are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html

 I hope we can get it reviewed in time for the next release.  (The wiki  
 page indicates that both std.benchmark and std.uni are currently being  
 reviewed, but I fail to find any "official" review threads on the  
 forum.  Is the wiki just out of date?)
I just reread the docs, considering Vladimir's point about space-containing no-arg programs. I agree there is a problem. We need to not get rid of the single program version of spawn, we need to simply interpret it as a no-arg program. To have this not work: spawnProcess("c:/Program Files/xyz/xyz.exe"); and require this instead: spawnProcess("c:/Program Files/xyz/xyz.exe", []); is not very intuitive. It reminds me of when we had writefln and not writeln, in order to print out a string with % in it, you had to do writefln("%s", "%s"); Now, I think we have an additional issue in that it's difficult to take a string argument with parameters in it, and pass it in one line: string executeThis = "prog arg1 arg2"; auto params = split(executeThis); spawnProcess(params[0], params[1..$]); It would be nice to just be able to do this: spawnProcess(split(executeThis)); I think we need an overload for that, especially if we get rid of the auto-splitting of commands. It should assert if the array is empty.
I propose we only have two versions: spawnProcess(string[] args, File stdin, etc...) spawnProcess(string[] args, string[string] env, File stdin, etc...) You'd use it like this: spawnProcess(["prog"]);
That allocates.  I don't like that requirement.

At the very least there should be a version which takes a simple string, an easy thing to wrap:

auto spawnProcess(string program, File stdin, etc...)
{
   return spawnProcess((&program)[0..1], stdin, etc...);
}

We should also consider a variadic solution.  In tango, things were done with an object, so the arguments were set via one method/constructor, and the options (stdin, stdout, etc) were set via another.  This allowed the great API of

setArgs(string[] ...)

Which supports

setArgs("progname", "arg1", "arg2")

and

setArgs("progname arg1 arg2".split())

without extra allocation.  However, we have two conflicting parts to spawnProcess that would be optional -- the variadic arg list, and the optional redirected handles and configuration.

We could just go full-bore variadic...

-Steve
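For readers unfamiliar with the two idioms in this post, here is a small sketch of both: a typesafe variadic parameter and the `(&program)[0..1]` slice. `countArgs` and `countSingle` are hypothetical stand-ins for setArgs/spawnProcess; they only count arguments and start nothing.

```d
import std.array : split;

/// Typesafe variadic parameter: callers may pass a list of strings
/// or a single string[]; either way the body sees one array.
size_t countArgs(string[] args...)
{
    return args.length;
}

/// Single-string convenience wrapper in the style of the post.
size_t countSingle(string program)
{
    // (&program)[0 .. 1] views the parameter itself as a one-element
    // array, so no array literal is evaluated here.
    return countArgs((&program)[0 .. 1]);
}

unittest
{
    assert(countArgs("progname", "arg1", "arg2") == 3);
    assert(countArgs("progname arg1 arg2".split()) == 3);
    assert(countSingle("c:/Program Files/xyz/xyz.exe") == 1);
}
```

Note that the slice points at the parameter's own storage, so it must not outlive the call; that is fine for a wrapper which forwards immediately, as in the post.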
Feb 26 2013
next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven 
Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:
 I propose we only have two versions:

 spawnProcess(string[] args, File stdin, etc...)
 spawnProcess(string[] args, string[string] env, File stdin, 
 etc...)

 You'd use it like this:

 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
'scope string[] args' should tell the compiler not to allocate.
 At the very least there should be version which takes a simple 
 string, an easy thing to wrap:

 auto spawnProcess(string program, File stdin, etc...)
 {
    return spawnProcess((&program)[0..1], stdin, etc...);
 }

 We should also consider a variadic solution.  In tango, things 
 were done with an object, so the arguments were set via one 
 method/constructor, and the options (stdin, stdout, etc) were 
 set via another.  This allowed the great API of

 setArgs(string[] ...)

 Which supports

 setArgs("progname", "arg1", "arg2")

 and

 setArgs("progname arg1 arg2".split())

 without extra allocation.  However, we have two conflicting 
 parts to spawnProcess that would be optional -- the variadic 
 arg list, and the optional redirected handles and configuration.

 We could just go full-bore variadic...
I'd rather not. Lars
Feb 26 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 12:03:51 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:
 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
'scope string[] args' should tell the compiler not to allocate.
That's not how it works. The expression [<anything>] allocates.
 At the very least there should be version which takes a simple string,  
 an easy thing to wrap:

 auto spawnProcess(string program, File stdin, etc...)
 {
    return spawnProcess((&program)[0..1], stdin, etc...);
 }

 We should also consider a variadic solution.  In tango, things were  
 done with an object, so the arguments were set via one  
 method/constructor, and the options (stdin, stdout, etc) were set via  
 another.  This allowed the great API of

 setArgs(string[] ...)

 Which supports

 setArgs("progname", "arg1", "arg2")

 and

 setArgs("progname arg1 arg2".split())

 without extra allocation.  However, we have two conflicting parts to  
 spawnProcess that would be optional -- the variadic arg list, and the  
 optional redirected handles and configuration.

 We could just go full-bore variadic...
I'd rather not.
Did that apply to all the statements above, or just the variadic part? -Steve
Feb 26 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 February 2013 at 18:24:08 UTC, Steven 
Schveighoffer wrote:
 On Tue, 26 Feb 2013 12:03:51 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven 
 Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:
 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
'scope string[] args' should tell the compiler not to allocate.
That's not how it works. The expression [<anything>] allocates.
Isn't that just a shortcoming of DMD? I thought 'scope' was all about avoiding such allocations.
 At the very least there should be version which takes a 
 simple string, an easy thing to wrap:

 auto spawnProcess(string program, File stdin, etc...)
 {
   return spawnProcess((&program)[0..1], stdin, etc...);
 }

 We should also consider a variadic solution.  In tango, 
 things were done with an object, so the arguments were set 
 via one method/constructor, and the options (stdin, stdout, 
 etc) were set via another.  This allowed the great API of

 setArgs(string[] ...)

 Which supports

 setArgs("progname", "arg1", "arg2")

 and

 setArgs("progname arg1 arg2".split())

 without extra allocation.  However, we have two conflicting 
 parts to spawnProcess that would be optional -- the variadic 
 arg list, and the optional redirected handles and 
 configuration.

 We could just go full-bore variadic...
I'd rather not.
Did that apply to all the statements above, or just the variadic part?
The variadic part.  (And the Tango-like object part, though that didn't read like a suggestion.)  A single-string version of spawnProcess() is OK with me, in combination with either of:

    spawnProcess(string, string[], etc.)
    spawnProcess(string[], etc.)

Lars
Feb 26 2013
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, February 26, 2013 20:33:14 Lars T. Kyllingstad wrote:
 On Tuesday, 26 February 2013 at 18:24:08 UTC, Steven
 
 Schveighoffer wrote:
 On Tue, 26 Feb 2013 12:03:51 -0500, Lars T. Kyllingstad
 
 <public kyllingen.net> wrote:
 On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven
 
 Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad
 
 <public kyllingen.net> wrote:
 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
'scope string[] args' should tell the compiler not to allocate.
That's not how it works. The expression [<anything>] allocates.
Isn't that just a shortcoming of DMD? I thought 'scope' was all about avoiding such allocations.
scope is all about enforcing that what's being passed to a function does not escape it. To quote the docs ( http://dlang.org/function.html ).

---------
references in the parameter cannot be escaped (e.g. assigned to a global variable)
---------

That has the added benefit of allowing the compiler to make optimizations (like not allocating a closure), but scope in and of itself doesn't necessarily mean that no allocation will occur. The spec says _nothing_ on that count. It doesn't even discuss scope in relation to delegates. All that's guaranteed is that no references to parameters marked with scope will escape.

However, regardless of all that, scope is currently only implemented for delegates (and even there, it's fairly buggy IIRC), so it will have zero effect on an array.

- Jonathan M Davis
Feb 26 2013
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 14:33:14 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Tuesday, 26 February 2013 at 18:24:08 UTC, Steven Schveighoffer wrote:
 That's not how it works.  The expression [<anything>] allocates.
Isn't that just a shortcoming of DMD? I thought 'scope' was all about avoiding such allocations.
Not that I am aware of. Any array expression calls _d_newArray. Even enums of array expressions call that every time they are used.
 The variadic part.  (And the Tango-like object part, though that didn't  
 read like a suggestion.)  A single-string version of spawnProcess() is  
 OK with me, in combination with either of:

    spawnProcess(string, string[], etc.)
    spawnProcess(string[], etc.)
OK, that sounds fine. The tango thing was just for background (because Process was an object, you could manipulate it before executing, thus allowing the variadic setting of arguments). -Steve
Feb 26 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven 
Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:
 You'd use it like this:

 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
I know this is debatable and that we've discussed this before, but I feel I should still mention that the cost of one more small allocation will be absolutely negligible compared to the cost of creating a new process, even taking into account long-term effects of heap fragmentation and such. I don't think that API design should suffer for such a small performance cost.
Feb 26 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 26 Feb 2013 12:10:00 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 26 February 2013 at 16:45:08 UTC, Steven Schveighoffer wrote:
 On Tue, 26 Feb 2013 11:09:48 -0500, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:
 You'd use it like this:

 spawnProcess(["prog"]);
That allocates. I don't like that requirement.
I know this is debatable and that we've discussed this before, but I feel I should still mention that the cost of one more small allocation will be absolutely negligible compared to the cost of creating a new process, even taking into account long-term effects of heap fragmentation and such. I don't think that API design should suffer for such a small performance cost.
Even the API is ugly. It takes no significant code or really understanding to make a single-arg spawnProcess that does the same thing. -Steve
Feb 26 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-02-26 17:45, Steven Schveighoffer wrote:

 without extra allocation.  However, we have two conflicting parts to
 spawnProcess that would be optional -- the variadic arg list, and the
 optional redirected handles and configuration.

 We could just go full-bore variadic...
I'm thinking named parameters, unfortunately we don't have that :( -- /Jacob Carlborg
Feb 26 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Feb 26, 2013 at 09:07:06PM +0100, Jacob Carlborg wrote:
 On 2013-02-26 17:45, Steven Schveighoffer wrote:
 
without extra allocation.  However, we have two conflicting parts to
spawnProcess that would be optional -- the variadic arg list, and the
optional redirected handles and configuration.

We could just go full-bore variadic...
I'm thinking named parameters, unfortunately we don't have that :(
[...] Y'know, it would be nice if AA literal syntax could be used for that purpose, if the compiler could be made aware of its meaning so that no actual allocation is done at runtime. But maybe it's a bit too late for that. T -- PNP = Plug 'N' Pray
Feb 26 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-02-26 21:29, H. S. Teoh wrote:

 Y'know, it would be nice if AA literal syntax could be used for that
 purpose, if the compiler could be made aware of its meaning so that no
 actual allocation is done at runtime.

 But maybe it's a bit too late for that.
I had this idea that could be used for named parameters: http://forum.dlang.org/thread/kfbnuc$1cro$1 digitalmars.com -- /Jacob Carlborg
Feb 27 2013
prev sibling next sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
Mini thing: Redirect.none is not documented
Mar 03 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 3 March 2013 at 11:00:52 UTC, Sönke Ludwig wrote:
 Mini thing: Redirect.none is not documented
Ok, thanks!
Mar 03 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 4 March 2013 at 06:51:15 UTC, Lars T. Kyllingstad 
wrote:
 On Sunday, 3 March 2013 at 11:00:52 UTC, Sönke Ludwig wrote:
 Mini thing: Redirect.none is not documented
Ok, thanks!
I ended up simply removing it. There is no point in calling pipeProcess without any redirection at all. Lars
Mar 05 2013
parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 05.03.2013 21:12, Lars T. Kyllingstad wrote:
 On Monday, 4 March 2013 at 06:51:15 UTC, Lars T. Kyllingstad wrote:
 On Sunday, 3 March 2013 at 11:00:52 UTC, Sönke Ludwig wrote:
 Mini thing: Redirect.none is not documented
Ok, thanks!
I ended up simply removing it. There is no point in calling pipeProcess without any redirection at all. Lars
OK, I was actually using pipeShell() with Redirect.none to get the output simply passed on to the console, but I simply overlooked that this is the job of spawnShell().
Mar 06 2013
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 23 February 2013 at 11:31:21 UTC, Lars T. 
Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use over 
 those years, both by yours truly and others, so hopefully the 
 worst wrinkles are already ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
Ok, a new version is up.  I think I have addressed the concerns that were brought up earlier, but please speak up if I've missed something that we agreed on.

A special thanks to Vladimir P. for pointing out an egregious flaw in the original design.

Lars
Mar 05 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 5 March 2013 at 20:19:06 UTC, Lars T. Kyllingstad 
wrote:
 A special thanks to Vladimir P. for pointing out an egregious 
 flaw in the original design.
But wait, there's more! (please don't hurt me)

1. Typo: "plattform"

2. Is there any meaning in the idea of consolidating spawnProcess/pipeProcess/execute and spawnShell/pipeShell/shell? How about that collectOutput idea?

3. Where are we with compatibility with the old module? One idea I haven't seen mentioned yet is: perhaps we could make the return value of "shell" have a deprecated "alias this" to the output string, so that it's implicitly convertible to a string to preserve compatibility.

4. Is there any way to deal with pipe clogging (pipe buffer getting exceeded when manually handling both input and output of a subprocess)? Can we query the number of bytes we can immediately read/write without blocking on a File?

5. How about that Environment.opIn_r?

Great work so far otherwise!
Mar 05 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 4. Is there any way to deal with pipe clogging (pipe buffer getting  
 exceeded when manually handling both input and output of a subprocess)?  
 Can we query the number of bytes we can immediately read/write without  
 blocking on a File?
I don't know how this could happen, can you elaborate?  Perhaps an example?

We are sort of stuck with File being the stream handler in phobos, which means we are currently stuck with FILE *.  I don't know if there is a way to do partial reads/writes on a FILE *, or checking to see if data is available.

-Steve
Mar 05 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 5 March 2013 at 21:55:24 UTC, Steven Schveighoffer 
wrote:
 On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 4. Is there any way to deal with pipe clogging (pipe buffer 
 getting exceeded when manually handling both input and output 
 of a subprocess)? Can we query the number of bytes we can 
 immediately read/write without blocking on a File?
I don't know how this could happen, can you elaborate? Perhaps an example?
OK! Here's a program based off the pipeProcess/pipeShell example:

---
import std.file;
import std.process2;
import std.stdio;
import std.string;

void main()
{
	auto pipes = pipeProcess("./my_application",
		Redirect.stdout | Redirect.stderr);
	scope(exit) wait(pipes.pid);

	// Store lines of output.
	string[] output;
	foreach (line; pipes.stdout.byLine)
		output ~= line.idup;

	// Store lines of errors.
	string[] errors;
	foreach (line; pipes.stderr.byLine)
		errors ~= line.idup;

	writefln("%d lines of stdout, %d lines of stderr",
		output.length, errors.length);
}
---

And here is an accompanying my_application.d:

---
import std.stdio;

enum N = 100;

void main()
{
	foreach (n; 0..N)
	{
		stdout.writeln("stdout");
		stderr.writeln("stderr");
	}
}
---

Now, everything works just fine when N is small. However, if you increase it to 10000, both the test program and my_application get stuck with 0% CPU usage.

The reason for that is that the stderr pipe is clogged: my_application can't write to it, because nothing is reading from the other end. At the same time, the first program is blocked on reading from the stdout pipe, but nothing is coming out, because my_application is blocked on writing to stderr.

By the way, I should mention that I ran into several issues while trying to come up with the above example. The test program does not work on Windows, for some reason I get the exception:

std.process2.ProcessException std\process2.d(494): Failed to spawn new process (The parameter is incorrect.)

I've also initially tried writing a different program:

---
import std.file;
import std.process2;
import std.string;

/// Sort an array of strings using the Unix "sort" program.
string[] unixSort(string[] lines)
{
	auto pipes = pipeProcess("sort", Redirect.stdin | Redirect.stdout);
	scope(exit) wait(pipes.pid);

	foreach (line; lines)
		pipes.stdin.writeln(line);
	pipes.stdin.close();

	string[] sortedLines;
	foreach (line; pipes.stdout.byLine())
		sortedLines ~= line.idup;

	return sortedLines;
}

void main()
{
	// For the sake of example, pretend these lines came from
	// some intensive computation, and not actually a file.
	auto lines = readText("input.txt").splitLines();

	auto sortedLines = unixSort(lines);
}
---

However, I couldn't get it to work on either Windows (same exception) or Linux (it just gets stuck, even with a very small input.txt). No idea if I'm doing something wrong (maybe I need to indicate EOF in some way?) or if the problem is elsewhere.
 We are sort of stuck with File being the stream handler in 
 phobos, which means we are currently stuck with FILE *.  I 
 don't know if there is a way to do partial reads/writes on a 
 FILE *, or checking to see if data is available.
I guess you could always get the OS file handles/descriptors and query them directly, although there's also the matter of the internal FILE * buffers.
Mar 05 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 5 March 2013 at 22:38:11 UTC, Vladimir Panteleev 
wrote:
 By the way, I should mention that I ran into several issues 
 while trying to come up with the above example. The test 
 program does not work on Windows, for some reason I get the 
 exception:

 std.process2.ProcessException std\process2.d(494): Failed to 
 spawn new process (The parameter is incorrect.)
"The parameter is incorrect" is a Windows system error message. Apparently, there is something wrong with one of the parameters we pass to CreateProcessW. I don't have my dev computer with me now, but my first guess would be the command line or one of the pipe handles. I'll check it out.
 I've also initially tried writing a different program:

 [...]

 However, I couldn't get it to work neither on Windows (same 
 exception) nor Linux (it just gets stuck, even with a very 
 small input.txt). No idea if I'm doing something wrong (maybe I 
 need to indicate EOF in some way?) or if the problem is 
 elsewhere.
Usually, when such things have happened to me, it is because I've forgotten to flush a stream. That doesn't seem to be the case here, though, since you close pipes.stdin manually. Do you know where the program gets stuck? I guess it is the read loop, but if you could verify that, it would be great. Lars
Mar 05 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 5 March 2013 at 21:55:24 UTC, Steven Schveighoffer wrote:
 On Tue, 05 Mar 2013 16:04:14 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 4. Is there any way to deal with pipe clogging (pipe buffer getting  
 exceeded when manually handling both input and output of a  
 subprocess)? Can we query the number of bytes we can immediately  
 read/write without blocking on a File?
I don't know how this could happen, can you elaborate? Perhaps an example?
OK! Here's a program based off the pipeProcess/pipeShell example: --- import std.file; import std.process2; import std.stdio; import std.string; void main() { auto pipes = pipeProcess("./my_application", Redirect.stdout | Redirect.stderr); scope(exit) wait(pipes.pid); // Store lines of output. string[] output; foreach (line; pipes.stdout.byLine) output ~= line.idup; // Store lines of errors. string[] errors; foreach (line; pipes.stderr.byLine) errors ~= line.idup; writefln("%d lines of stdout, %d lines of stderr", output.length, errors.length); } --- And here is an accompanying my_application.d: --- import std.stdio; enum N = 100; void main() { foreach (n; 0..N) { stdout.writeln("stdout"); stderr.writeln("stderr"); } } --- Now, everything works just fine when N is small. However, if you increase it to 10000, both the test program and my_application get stuck with 0% CPU usage. The reason for that is that the stderr pipe is clogged: my_application can't write to it, because nothing is reading from the other end. At the same time, the first program is blocked on reading from the stdout pipe, but nothing is coming out, because my_application is blocked on writing to stderr.
Right, the issue there is, File does not make a good socket/pipe interface.  I don't know what to do about that.

A while ago (2008 or 09 I believe?), I was using Tango's Process object to execute programs on a remote agent, and forwarding all the resulting data back over the network.  On Linux, I used select to read data as it arrived.  On Windows, I think I had to spawn off a separate thread to wait for data/child processes.

But Tango did not base its I/O on FILE *, so I think we had more flexibility there.

Suggestions are welcome...
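One way around the clogging, in the spirit of the separate-thread approach mentioned above, is to drain stderr on a helper thread while the main thread reads stdout. The sketch below is written against the std.process that later shipped in Phobos, which is an assumption here since the thread is about std.process2 and its signatures may differ. `Captured` and `runAndCapture` are hypothetical names.

```d
import core.thread : Thread;
import std.process : pipeProcess, Redirect, wait;
import std.stdio : File;

/// Hypothetical result type for this sketch.
struct Captured
{
    string[] output;
    string[] errors;
    int status;
}

/// Runs argv and collects stdout and stderr concurrently, so neither
/// pipe buffer can fill up while the other is being read.
Captured runAndCapture(string[] argv)
{
    auto pipes = pipeProcess(argv, Redirect.stdout | Redirect.stderr);

    // Drain stderr on a helper thread.
    string[] errors;
    File errFile = pipes.stderr;
    auto errReader = new Thread({
        foreach (line; errFile.byLine)
            errors ~= line.idup;
    });
    errReader.start();

    // Meanwhile, drain stdout on the calling thread.
    string[] output;
    foreach (line; pipes.stdout.byLine)
        output ~= line.idup;

    // Both streams have reached EOF once the join returns.
    errReader.join();
    immutable status = wait(pipes.pid);
    return Captured(output, errors, status);
}
```

This keeps the File conveniences such as byLine at the price of one extra thread per redirected stream. It does not answer the question of querying how many bytes can be read without blocking.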
 By the way, I should mention that I ran into several issues while trying  
 to come up with the above example. The test program does not work on  
 Windows, for some reason I get the exception:

 std.process2.ProcessException std\process2.d(494): Failed to spawn new  
 process (The parameter is incorrect.)
I think Lars is on that.
 I've also initially tried writing a different program:

 ---
 import std.file;
 import std.process2;
 import std.string;

 /// Sort an array of strings using the Unix "sort" program.
 string[] unixSort(string[] lines)
 {
 	auto pipes = pipeProcess("sort", Redirect.stdin | Redirect.stdout);
 	scope(exit) wait(pipes.pid);

 	foreach (line; lines)
 		pipes.stdin.writeln(line);
 	pipes.stdin.close();

 	string[] sortedLines;
 	foreach (line; pipes.stdout.byLine())
 		sortedLines ~= line.idup;

 	return sortedLines;
 }

 void main()
 {
 	// For the sake of example, pretend these lines came from
 	// some intensive computation, and not actually a file.
 	auto lines = readText("input.txt").splitLines();

 	auto sortedLines = unixSort(lines);
 }
 ---

 However, I couldn't get it to work neither on Windows (same exception)  
 nor Linux (it just gets stuck, even with a very small input.txt). No  
 idea if I'm doing something wrong (maybe I need to indicate EOF in some  
 way?) or if the problem is elsewhere.
Linux should work here. From what I can tell, you are doing it right. If I get some time, I'll try and debug this.
 We are sort of stuck with File being the stream handler in phobos,  
 which means we are currently stuck with FILE *.  I don't know if there  
 is a way to do partial reads/writes on a FILE *, or checking to see if  
 data is available.
I guess you could always get the OS file handles/descriptors and query them directly, although there's also the matter of the internal FILE * buffers.
I think at that point, you would have to forgo all usage of File niceties (writeln, etc). Which would really suck. But on the read end, this is a very viable option. -Steve
Mar 06 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:


 a while ago (2008 or 09 I believe?), I was using Tango's Process object  
 to execute programs on a remote agent, and forwarding all the resulting  
 data back over the network.  On Linux, I used select to read data as it  
 arrived.  On Windows, I think I had to spawn off a separate thread to  
 wait for data/child processes.
More coming back to me now -- Windows pipes actually suck quite a bit. You can't use the normal mechanisms to wait for data on them. I also needed to spawn threads so I could combine the event-driven wait for socket data from the remote instance with the data from the pipes. I seem to remember opening a socket to my own process in order to do this. -Steve
Mar 06 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
06-Mar-2013 21:00, Steven Schveighoffer wrote:
 On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:


 a while ago (2008 or 09 I believe?), I was using Tango's Process
 object to execute programs on a remote agent, and forwarding all the
 resulting data back over the network.  On Linux, I used select to read
 data as it arrived.  On Windows, I think I had to spawn off a separate
 thread to wait for data/child processes.
More coming back to me now -- Windows pipes actually suck quite a bit. You can't use the normal mechanisms to wait for data on them. I also needed to spawn threads so I could combine the event-driven wait for socket data from the remote instance with the data from the pipes. I seem to remember opening a socket to my own process in order to do this.
There is async read/write on pipes. Though no wait on pipes does suck.
 -Steve
-- Dmitry Olshansky
Mar 06 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 06 Mar 2013 16:57:39 -0500, Dmitry Olshansky
<dmitry.olsh gmail.com> wrote:

 06-Mar-2013 21:00, Steven Schveighoffer wrote:
 On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:


 a while ago (2008 or 09 I believe?), I was using Tango's Process
 object to execute programs on a remote agent, and forwarding all the
 resulting data back over the network.  On Linux, I used select to read
 data as it arrived.  On Windows, I think I had to spawn off a separate
 thread to wait for data/child processes.
More coming back to me now -- Windows pipes actually suck quite a bit.
 You can't use the normal mechanisms to wait for data on them.

 I also needed to spawn threads so I could combine the event-driven wait
 for socket data from the remote instance with the data from the pipes.
 I seem to remember opening a socket to my own process in order to do
 this.
There is async read/write on pipes. Though no wait on pipes does suck.
Hm... I noted in the docs that async read/write is not supported: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365141(v=vs.85).aspx "Asynchronous (overlapped) read and write operations are not supported by anonymous pipes. This means that you cannot use the ReadFileEx and WriteFileEx functions with anonymous pipes. In addition, the lpOverlapped parameter of ReadFile and WriteFile is ignored when these functions are used with anonymous pipes." -Steve
Mar 07 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
07-Mar-2013 16:50, Steven Schveighoffer wrote:
 On Wed, 06 Mar 2013 16:57:39 -0500, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 06-Mar-2013 21:00, Steven Schveighoffer wrote:
 On Wed, 06 Mar 2013 11:45:54 -0500, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:


 a while ago (2008 or 09 I believe?), I was using Tango's Process
 object to execute programs on a remote agent, and forwarding all the
 resulting data back over the network.  On Linux, I used select to read
 data as it arrived.  On Windows, I think I had to spawn off a separate
 thread to wait for data/child processes.
More coming back to me now -- Windows pipes actually suck quite a bit. You can't use the normal mechanisms to wait for data on them. I also needed to spawn threads so I could combine the event-driven wait for socket data from the remote instance with the data from the pipes. I seem to remember opening a socket to my own process in order to do this.
There is async read/write on pipes. Though no wait on pipes does suck.
Hm... I noted in the docs that async read/write is not supported: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365141(v=vs.85).aspx "Asynchronous (overlapped) read and write operations are not supported by anonymous pipes. This means that you cannot use the ReadFileEx and WriteFileEx functions with anonymous pipes. In addition, the lpOverlapped parameter of ReadFile and WriteFile is ignored when these functions are used with anonymous pipes."
Hm.. how shitty. Especially since: "Anonymous pipes are implemented using a named pipe with a unique name. Therefore, you can often pass a handle to an anonymous pipe to a function that requires a handle to a named pipe." And e.g. this (Named pipe using overlapped I/O): http://msdn.microsoft.com/en-us/library/windows/desktop/aa365603(v=vs.85).aspx -- Dmitry Olshansky
Mar 07 2013
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer 
wrote:
 On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:
 By the way, I should mention that I ran into several issues 
 while trying to come up with the above example. The test 
 program does not work on Windows, for some reason I get the 
 exception:

 std.process2.ProcessException std\process2.d(494): Failed to 
 spawn new process (The parameter is incorrect.)
I think Lars is on that.
I will be, but I don't know when. It may be a few days. So if you have the time and you feel like it, feel free to have a look at it. :) Lars
Mar 06 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer 
wrote:
 On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 By the way, I should mention that I ran into several issues 
 while trying to come up with the above example. The test 
 program does not work on Windows, for some reason I get the 
 exception:

 std.process2.ProcessException std\process2.d(494): Failed to 
 spawn new process (The parameter is incorrect.)
I think Lars is on that.
I'm going to need some help with this one. I only have Linux on my computer, and I can't reproduce the bug in Wine. As a first step, could someone else try to run Vladimir's test case?
 I've also initially tried writing a different program:

 [...]
Linux should work here. From what I can tell, you are doing it right. If I get some time, I'll try and debug this.
I think I know what the problem is, and it sucks bigtime. :(

Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process. We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it.

In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.

I don't know how to solve this in a good way. I can think of a few alternatives, and they all suck:

1. Make a "special" spawnProcess() function for pipe redirection.
2. Use the "process object" approach, like Tango and Qt.
3. After fork(), in the child process, loop over the full range of possible file descriptors and close the ones we don't want open.

The last one would let us keep the current API (and would have the added benefit of cleaning up unused FDs) but I have no idea how it would impact performance.

Lars
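
To make the failure mode concrete, here is a minimal sketch (not code from std.process2) of why a stray copy of the write end keeps the reader from ever seeing EOF. It assumes a POSIX system. dup() merely stands in for the copy that a forked child would inherit, and the names atEof, EofDemo and demoStrayWriteEnd are made up for illustration.

```d
import core.sys.posix.fcntl : fcntl, F_GETFL, F_SETFL, O_NONBLOCK;
import core.sys.posix.unistd : close, dup, pipe, read;

/// True if a read on `fd` reports end-of-file right now.
/// On a non-blocking pipe with no data but a live writer, read() returns -1.
bool atEof(int fd)
{
    ubyte[1] buf;
    return read(fd, buf.ptr, buf.length) == 0;
}

/// Result of the demonstration (hypothetical helper type).
struct EofDemo
{
    bool eofWithStrayCopy;    // EOF seen while a second write end is still open?
    bool eofAfterClosingCopy; // EOF seen once every write end is closed?
}

EofDemo demoStrayWriteEnd()
{
    int[2] fds;
    if (pipe(fds) != 0)
        assert(0, "pipe() failed");
    immutable readEnd = fds[0];
    immutable writeEnd = fds[1];

    // Make the read end non-blocking so the demo cannot hang.
    fcntl(readEnd, F_SETFL, fcntl(readEnd, F_GETFL) | O_NONBLOCK);

    // Stand-in for the write end that the child process inherits.
    immutable strayCopy = dup(writeEnd);

    // The parent closes "its" write end, as in the failing test program.
    close(writeEnd);

    EofDemo result;
    result.eofWithStrayCopy = atEof(readEnd);    // no EOF: a writer still exists

    close(strayCopy);
    result.eofAfterClosingCopy = atEof(readEnd); // EOF: all write ends are gone

    close(readEnd);
    return result;
}
```

With a blocking read end, the first atEof() call would simply wait forever, which is what "sort" does in the test case.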
Mar 09 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 9 March 2013 at 16:05:15 UTC, Lars T. Kyllingstad 
wrote:
 I think I know what the problem is, and it sucks bigtime. :(

 Since the child process inherits the parent's open file 
 descriptors, both ends of a pipe will be open in the child 
 process.  We have separated pipe creation and process creation, 
 so spawnProcess() knows nothing about the "other" end of the 
 pipe it receives, and is therefore unable to close it.

 In this particular case, the problem is that "sort" doesn't do 
 anything until it receives EOF on standard input, which never 
 happens, because even though the write end of the pipe is 
 closed in the parent process, it is still open in the child.

 I don't know how to solve this in a good way.  I can think of a 
 few alternatives, and they all suck:

 1. Make a "special" spawnProcess() function for pipe 
 redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full range 
 of possible file descriptors and close the ones we don't want 
 open.

 The last one would let us keep the current API (and would have 
 the added benefit of cleaning up unused FDs) but I have no idea 
 how it would impact performance.
I have tried (3), and confirmed that it does indeed solve the problem. Lars
Mar 09 2013
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
 On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:
 I've also initially tried writing a different program:

 [...]
Linux should work here. From what I can tell, you are doing it right. If I get some time, I'll try and debug this.
I think I know what the problem is, and it sucks bigtime. :( Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process. We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it. In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.
Oh crap, that is bad. Unlike Windows which is an opt-in strategy, unix has an opt-out strategy (there is the FD_CLOEXEC flag). For consistency, I think it would be good to close all the file descriptors before calling exec.
 I don't know how to solve this in a good way.  I can think of a few  
 alternatives, and they all suck:

 1. Make a "special" spawnProcess() function for pipe redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full range of  
 possible file descriptors and close the ones we don't want open.

 The last one would let us keep the current API (and would have the added  
 benefit of cleaning up unused FDs) but I have no idea how it would  
 impact performance.
I think 3 is the correct answer, it is consistent with Windows, and the most logical behavior. For instance, if other threads are open and doing other things that aren't related (like network sockets), those too will be inherited! We should close all file descriptors. How do you loop over all open ones? Just curious :) -Steve
Mar 09 2013
next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 9 March 2013 at 18:35:25 UTC, Steven Schveighoffer 
wrote:
 On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven 
 Schveighoffer wrote:
 On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:
 I've also initially tried writing a different program:

 [...]
Linux should work here. From what I can tell, you are doing it right. If I get some time, I'll try and debug this.
I think I know what the problem is, and it sucks bigtime. :( Since the child process inherits the parent's open file descriptors, both ends of a pipe will be open in the child process. We have separated pipe creation and process creation, so spawnProcess() knows nothing about the "other" end of the pipe it receives, and is therefore unable to close it. In this particular case, the problem is that "sort" doesn't do anything until it receives EOF on standard input, which never happens, because even though the write end of the pipe is closed in the parent process, it is still open in the child.
Oh crap, that is bad. Unlike Windows which is an opt-in strategy, unix has an opt-out strategy (there is the FD_CLOEXEC flag). For consistency, I think it would be good to close all the file descriptors before calling exec.
 I don't know how to solve this in a good way.  I can think of 
 a few alternatives, and they all suck:

 1. Make a "special" spawnProcess() function for pipe 
 redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full 
 range of possible file descriptors and close the ones we don't 
 want open.

 The last one would let us keep the current API (and would have 
 the added benefit of cleaning up unused FDs) but I have no 
 idea how it would impact performance.
I think 3 is the correct answer, it is consistent with Windows, and the most logical behavior. For instance, if other threads are open and doing other things that aren't related (like network sockets), those too will be inherited! We should close all file descriptors.
I think so too. In C, you have to know about these things, and they are specified in the documentation for fork() and exec(). In D you shouldn't have to know, things should "just work" the way you expect them to.
 How do you loop over all open ones?  Just curious :)
You don't. That is why I said solution (3) sucks too. :)

You have to loop over all possible non-std file descriptors, i.e. from 3 to the maximum number of open files. (On my Ubuntu installation, this is by default 1024, but may be as much as 4096. I don't know about other *NIXes.)

Here is how to do it:

    import core.sys.posix.unistd, core.sys.posix.sys.resource;
    rlimit r;
    getrlimit(RLIMIT_NOFILE, &r);
    for (int i = 0; i < r.rlim_cur; ++i)
        close(i);

Lars
Mar 09 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 9 March 2013 at 18:44:45 UTC, Lars T. Kyllingstad 
wrote:
 On Saturday, 9 March 2013 at 18:35:25 UTC, Steven Schveighoffer 
 wrote:
 How do you loop over all open ones?  Just curious :)
You don't. That is why I said solution (3) sucks too. :) You have to loop over all possible non-std file descriptors, i.e. from 3 to the maximum number of open files. (On my Ubuntu installation, this is by default 1024, but may be as much as 4096. I don't know about other *NIXes) Here is how to do it: import core.sys.posix.unistd, core.sys.posix.sys.resource; rlimit r; getrlimit(RLIMIT_NOFILE, &r); for (int i = 0; i < r.rlim_cur; ++i) close(i);
BTW, core.sys.posix.sys.resource currently doesn't exist, we have to create it. Lars
Mar 09 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 13:44:44 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Saturday, 9 March 2013 at 18:35:25 UTC, Steven Schveighoffer wrote:
 How do you loop over all open ones?  Just curious :)
You don't. That is why I said solution (3) sucks too. :) You have to loop over all possible non-std file descriptors, i.e. from 3 to the maximum number of open files. (On my Ubuntu installation, this is by default 1024, but may be as much as 4096. I don't know about other *NIXes) Here is how to do it: import core.sys.posix.unistd, core.sys.posix.sys.resource; rlimit r; getrlimit(RLIMIT_NOFILE, &r); for (int i = 0; i < r.rlim_cur; ++i) close(i);
Hm... don't close 0, 1, 2 :)

On Linux at least, you could use /proc/self/fd.

I suppose it's faster just to loop though. How long does it take when you close non-open descriptors? We don't want to hamper performance too much.

I wonder if select on all possible file descriptors in the fd_err parameter would give you a clue as to which were invalid.

-Steve
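
For reference, the /proc/self/fd idea could be sketched roughly as below. This is Linux-specific and only an illustration: openDescriptors is a made-up name, and the listing also contains the descriptor that the directory iteration itself opens temporarily.

```d
import std.conv : to;
import std.file : dirEntries, SpanMode;
import std.path : baseName;

/// Lists the descriptors currently open in this process (Linux only).
/// Each entry in /proc/self/fd is a symlink whose name is a descriptor number.
int[] openDescriptors()
{
    int[] fds;
    foreach (entry; dirEntries("/proc/self/fd", SpanMode.shallow))
    {
        // entry.name is a full path such as "/proc/self/fd/3".
        fds ~= entry.name.baseName.to!int;
    }
    return fds;
}
```

Whether this beats a plain loop is unclear: it allocates and reads a directory, which is awkward to do between fork() and exec().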
Mar 09 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 13:54:32 -0500, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:


 I wonder if select on all possible file descriptors in the fd_err  
 parameter would give you a clue as to which were invalid.
Hm... seems like select returns -1 if an invalid descriptor is included. That probably means it won't flag all of them.. -Steve
Mar 09 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Saturday, 9 March 2013 at 18:54:31 UTC, Steven Schveighoffer 
wrote:
 On Sat, 09 Mar 2013 13:44:44 -0500, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 On Saturday, 9 March 2013 at 18:35:25 UTC, Steven 
 Schveighoffer wrote:
 How do you loop over all open ones?  Just curious :)
You don't. That is why I said solution (3) sucks too. :) You have to loop over all possible non-std file descriptors, i.e. from 3 to the maximum number of open files. (On my Ubuntu installation, this is by default 1024, but may be as much as 4096. I don't know about other *NIXes) Here is how to do it: import core.sys.posix.unistd, core.sys.posix.sys.resource; rlimit r; getrlimit(RLIMIT_NOFILE, &r); for (int i = 0; i < r.rlim_cur; ++i) close(i);
Hm... don't close 0, 1, 2 :) On Linux at least, you could use /proc/self/fd I suppose it's faster just to loop though. How long does it take when you close non-open descriptors? We don't want to hamper performance too much.
On my computer, with 1024 (minus 3 ;)) possible file descriptors, it roughly doubles the time spent inside spawnProcess() up to, but not including, the execve() call. (About 0.1 microsecond per file descriptor.) Considering that execve() probably dwarfs that number, I think we're in good shape. Of course, we have a problem if some other platform allows ulong.max open files...
 I wonder if select on all possible file descriptors in the 
 fd_err parameter would give you a clue as to which were invalid.
My guess is that select() uses more or less the same mechanism as close() for checking FD validity, and thus would gain us nothing. Lars
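
Putting the corrections from this subthread together, the loop might look like the sketch below. It is not the code that went into the pull request. It assumes a druntime that ships core.sys.posix.sys.resource, and the function name closeDescriptorsFrom and its cap parameter (an arbitrary guard against a huge or unlimited RLIMIT_NOFILE) are invented for illustration.

```d
import core.sys.posix.sys.resource : getrlimit, rlimit, RLIMIT_NOFILE;
import core.sys.posix.unistd : close;

/// Closes every descriptor in [lowest, min(soft limit, cap)).
/// Starting at 3 leaves stdin, stdout and stderr alone.
/// Returns how many descriptors were actually open.
int closeDescriptorsFrom(int lowest = 3, int cap = 65_536)
{
    rlimit r;
    if (getrlimit(RLIMIT_NOFILE, &r) != 0)
        return 0;

    // The soft limit may be very large (or RLIM_INFINITY), so clamp it.
    immutable int end = r.rlim_cur < cast(ulong) cap ? cast(int) r.rlim_cur : cap;

    int closed = 0;
    foreach (fd; lowest .. end)
    {
        // close() fails with EBADF for descriptors that are not open;
        // that is expected here and simply ignored.
        if (close(fd) == 0)
            ++closed;
    }
    return closed;
}
```

In spawnProcess() this would run in the child, after fork() and after the redirected streams have been moved onto 0, 1 and 2, but before exec.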
Mar 09 2013
next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 14:24:49 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 My guess is that select() uses more or less the same mechanism as  
 close() for checking FD validity, and thus would gain us nothing.
Yeah, I was hoping select would flag each "non-valid" file descriptor in fd_errors, and do it all in an internal loop in the kernel instead of going back and forth between user space and kernel space. But I don't think it does that. It just errors the whole function on the first invalid descriptor it sees. Go with your workaround, sounds reasonable. I agree that 100us pales in comparison to launching a program. Perhaps an option to disable this behavior should be available in Config. It certainly is possible with the FD_CLOEXEC flag to manually do what needs to be done. -Steve
Mar 09 2013
prev sibling parent "jerro" <a a.com> writes:
 Of course, we have a problem if some other platform allows 
 ulong.max open files...
You can increase (on Linux) the maximal number of open files by adding something like

    your_username hard nofile 65536

to /etc/security/limits.conf and using ulimit. You can increase it up to /proc/sys/fs/file-max (which is 394062 by default on my machine but can also be increased) that way. I don't know what the maximal valid value of /proc/sys/fs/file-max is, but it can not be more than int.max, since file descriptors are ints.
Mar 10 2013
prev sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sat, 09 Mar 2013 13:35:26 -0500,
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:
 
 1. Make a "special" spawnProcess() function for pipe redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full range of  
 possible file descriptors and close the ones we don't want open.

 The last one would let us keep the current API (and would have the added  
 benefit of cleaning up unused FDs) but I have no idea how it would  
 impact performance.
I think 3 is the correct answer, it is consistent with Windows, and the most logical behavior. For instance, if other threads are open and doing other things that aren't related (like network sockets), those too will be inherited! We should close all file descriptors.
So that means on Posix any programming scheme involving passing open descriptors on to child processes is not going to work with std.process? Not that I know of any, but if that's what will happen it is good to know the cost. ;) -- Marco
Mar 10 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 10 March 2013 at 16:04:51 UTC, Marco Leise wrote:
 On Sat, 09 Mar 2013 13:35:26 -0500,
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:
 
 1. Make a "special" spawnProcess() function for pipe 
 redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full 
 range of  possible file descriptors and close the ones we 
 don't want open.

 The last one would let us keep the current API (and would 
 have the added  benefit of cleaning up unused FDs) but I 
 have no idea how it would  impact performance.
I think 3 is the correct answer, it is consistent with Windows, and the most logical behavior. For instance, if other threads are open and doing other things that aren't related (like network sockets), those too will be inherited! We should close all file descriptors.
So that means on Posix any programming scheme involving passing open descriptors on to child processes is not going to work with std.process? Not that I know of any, but if that's what will happen it is good to know the cost. ;)
I think the idea is that if you rely on such platform-specific behavior, you probably shouldn't be using std.process in the first place - as its goal is to provide high-level cross-platform abstractions for the most common operations, rather than try to cover all features exposed by all operating system APIs.
Mar 10 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sun, 10 Mar 2013 17:07:26 +0100,
"Vladimir Panteleev" <vladimir thecybershadow.net> wrote:

 I think the idea is that if you rely on such platform-specific 
 behavior, you probably shouldn't be using std.process in the 
 first place - as its goal is to provide high-level cross-platform 
 abstractions for the most common operations, rather than try to 
 cover all features exposed by all operating system APIs.
Not necessarily. Why do you have to deal with this stuff in std.process in the first place? Because during development of the sockets and file APIs in Phobos no one thought about this issue. It should be a Phobos convention that descriptors are closed on exec by default and changed in the few places where sockets and files are created. Someone who uses Posix APIs directly can thus rely on their behavior and std.process can stay ignorant to this platform difference and avoid ugly hacks. -- Marco
Mar 10 2013
next sibling parent Marco Leise <Marco.Leise gmx.de> writes:
Phobos isn't the first and won't be the last to tackle this
issue:

http://www.google.de/search?q=%22set+FD_CLOEXEC%22+OR+%22FD_CLOEXEC+not+set%22+bug

Affected projects range from MySQL to Mozilla.

-- 
Marco
Mar 10 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 10 Mar 2013 13:03:20 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 On Sun, 10 Mar 2013 17:07:26 +0100,
 "Vladimir Panteleev" <vladimir thecybershadow.net> wrote:

 I think the idea is that if you rely on such platform-specific
 behavior, you probably shouldn't be using std.process in the
 first place - as its goal is to provide high-level cross-platform
 abstractions for the most common operations, rather than try to
 cover all features exposed by all operating system APIs.
I think we can find a mix. Since fork gives you a convenient location to close all open file descriptors, we should do so by default. But if you want standard posix behavior and you want to rely on FD_CLOEXEC, you should be able to do that with a flag to spawnProcess. We already have a Config parameter that is already used to control stream closing behavior. We should extend that.
 Not necessarily. Why do you have to deal with this stuff in
 std.process in the first place? Because during development of
 the sockets and file APIs in Phobos no one thought about this
 issue. It should be a Phobos convention that descriptors are
 closed on exec by default and changed in the few places where
 sockets and files are created.
It only becomes a problem for long-living subprograms. For example, if you run an "ls" command as a subprocess, who cares if it keeps open sockets for a fraction of a second? The other real issue is if a child process unknowingly keeps its stdin/stderr/stdout open by having the other end open (Vladimir's situation). But it certainly is a problem that we can and should solve.

However, I don't agree with your solution. We should not be infiltrating std.process hacks into all creation of streams. Not only is that difficult to maintain, but it decouples the purpose of code from the actual implementation by quite a bit. A process may never even spawn a child process, or it may call functions that create pipes/threads that DON'T set FD_CLOEXEC. Maybe the 3rd party library didn't get that memo.

I see no issue with std.process handling the historic flaws in process creation, that is where it belongs IMO. What's nice about it is, with the "close all open descriptors" method, it handles all these cases quite well. We should also give the user a "roll your own" option where it doesn't close these descriptors for you, you must set FD_CLOEXEC manually.
 Someone who uses Posix APIs directly can thus rely on their
 behavior and std.process can stay ignorant to this platform
 difference and avoid ugly hacks.
Both ways are ugly hacks. I'd rather have the hacks be in one place, and not require 3rd party libs to comply. -Steve
Mar 10 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Sun, 10 Mar 2013 21:10:24 -0400,
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 I think we can find a mix.  Since fork gives you a convenient location to  
 close all open file descriptors, we should do so by default.  But if you  
 want standard posix behavior and you want to rely on FD_CLOEXEC, you should  
 be able to do that with a flag to spawnProcess.  We already have a Config  
 parameter that is already used to control stream closing behavior.  We  
 should extend that.

 [...]
 
 However, I don't agree with your solution.  We should not be infiltrating  
 std.process hacks into all creation of streams.  
I don't see it as introducing std.process hacks everywhere but fixing Phobos file handle semantics in a clean way. Just try to look at it from that perspective. My thinking is that it works without any hack at all, by attacking the issue at the _source_ where we can still simply ask Posix to do the right thing(tm). We can level the operating system differences right at the point where we open files, instead of messing with the semantics of spawning sub processes later by closing all but 0, 1 and 2 by default, which is an actual hack.
 Not only is that  
 difficult to maintain, but it decouples the purpose of code from the  
 actual implementation by quite a bit.
Windows has a flag in both locations as well: When you create the (file) handle and when you create a sub process. And a common file handle/descriptor property in all OSs is whether it is inheritable. Whereas no such common ground exists for spawning processes.
 A process may never even spawn a child process,
No harm done in that case.
 or it may call functions that create pipes/threads that  
 DON'T set FD_CLOEXEC.  Maybe the 3rd party library didn't get that memo.
Yes and yes. It's not a BUG to do that deliberately, and it is cleanly supported by Windows too: you can open stuff with bInheritHandle set in SECURITY_ATTRIBUTES, duplicating that behavior. Consider these changes of perspective:

* A hardcore Posix fanatic could just as well argue that Windows code forgetting to set bInheritHandle for all opened files is at fault, since inheritance should be the default.
* You mentioned libs opening file descriptors without Phobos. By the same thinking someone could spawn child processes without Phobos - still inheriting the open descriptors.

This is outside the scope of std.process: We have a security relevant property of open files that is supported on all OS, but we don't set it to the same value, which should be: safe by default. (That is why I opened a new thread about it in case you were wondering.)
 I see no issue with std.process handling the historic flaws in process  
 creation, that is where it belongs IMO.
But it also - as a file handle/descriptor property - belongs to creating those.
 What's nice about it is, with the "close all open descriptors" method,
 it handles all these cases quite well.  We should also give the user
 a "roll your own" option where it doesn't close these descriptors for
 you, you must set FD_CLOEXEC manually.
Assuming this is the compromise - with the Windows code path using bInheritHandles for CreateProcess - this still leaves us with Phobos creating inheritable handles on Posix and non-inheritable ones on Windows. Where it should be opt-in on both.
 Both ways are ugly hacks. I'd rather have the hacks be in one place,
 and not require 3rd party libs to comply.
 
 -Steve
I've placed some example implementations from other languages in the other thread... -- Marco
Mar 10 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 11 Mar 2013 01:24:17 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 On Sun, 10 Mar 2013 21:10:24 -0400,
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 I think we can find a mix.  Since fork gives you a convenient location  
 to
 close all open file descriptors, we should do so by default.  But if you
 want standard posix behavior and you want to rely on FD_CLOEXEC, you  
 should
 be able to do that with a flag to spawnProcess.  We already have a  
 Config
 parameter that is already used to control stream closing behavior.  We
 should extend that.

 [...]

 However, I don't agree with your solution.  We should not be  
 infiltrating
 std.process hacks into all creation of streams.
I don't see it as introducing std.process hacks everywhere but fixing Phobos file handle semantics in a clean way. Just try to look at it from that perspective.
Any time you are fixing an OS flaw in user code, it's a hack :)
 Not only is that
 difficult to maintain, but it decouples the purpose of code from the
 actual implementation by quite a bit.
Windows has a flag in both locations as well: When you create the (file) handle and when you create a sub process. And a common file handle/descriptor property in all OSs is whether it is inheritable. Whereas no such common ground exists for spawning processes.
What I mean is, this is an issue with spawning child processes. It belongs in a function that spawns child processes.
 A process may never even spawn a child process,
No harm done in that case.
 or it may call functions that create pipes/threads that
 DON'T set FD_CLOEXEC.  Maybe the 3rd party library didn't get that memo.
Yes and yes. Its not a BUG to do that deliberately and cleanly supported by Windows as you can open stuff with bInheritHandle set in SECURITY_ATTRIBUTES, duplicating that behavior. Consider these changes of perspective: * A hardcore Posix fanatic could just as well argue that Windows code forgetting to set bInheritHandle for all opened files is at fault. Since inheritance should be the default.
I don't think that would be a good solution either (setting all handles to inherit on Windows). IMO, D should be as compatible as possible with the target OS. This means:

a) libraries written in other languages which D deals with are not run in an environment that is unexpected (e.g. all handles are unexpectedly marked CLOEXEC).
b) D's code can run without expecting to have special conditions.
 * You mentioned libs opening file descriptors without Phobos.
   By the same thinking someone could spawn child processes
   without Phobos - still inheriting the open descriptors.
With your solution, they would have to open all their handles WITHOUT using phobos. Likely a huge pain.
 I see no issue with std.process handling the historic flaws in process
 creation, that is where it belongs IMO.
But it also - as a file handle/descriptor property - belongs to creating those.
The property specifically deals with creating processes. It's not a handle property, it's just stored with the handle because it requires one flag per handle, and the space is there.

Technically, even Windows has this wrong. Because they don't give a place for you to do the "right thing" per execution of CreateProcess (like fork does), it's possible to have a race where you inadvertently inherit handles you don't mean to. Example:

1. thread 1 wants to spawn a process, creates pipes for the standard handles, sets the appropriate ends to inheritable
2. thread 2 wants to spawn a process, creates pipes for the standard handles, sets the appropriate ends to inheritable
3. thread 1 spawns his process, which assigns its pipes to the standard handles, but it has also inherited thread 2's pipes!

This is like Vladimir's problem, but involves a race, so it will only happen once in a blue moon! Unfortunately, we have no recourse for this...
 What's nice about it is, with the "close all open descriptors" method,
 it handles all these cases quite well.  We should also give the user
 a "roll your own" option where it doesn't close these descriptors for
 you; you must set FD_CLOEXEC manually.
Assuming this is the compromise - with the Windows code path using bInheritHandles for CreateProcess - this still leaves us with Phobos creating inheritable handles on Posix and non-inheritable ones on Windows. Where it should be opt-in on both.
I don't agree. It should do the default that the OS provides. Variations are available by using OS-specific functions.
 I've placed some example implementations from other languages
 in the other thread...
It's good that there is precedent for your idea. But I don't agree with the design regardless of whether other implementations have done it. -Steve
Mar 11 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Saturday, 9 March 2013 at 16:05:15 UTC, Lars T. Kyllingstad 
wrote:
 1. Make a "special" spawnProcess() function for pipe 
 redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full range 
 of possible file descriptors and close the ones we don't want 
 open.

 The last one would let us keep the current API (and would have 
 the added benefit of cleaning up unused FDs) but I have no idea 
 how it would impact performance.
How about this: Set FD_CLOEXEC on all pipes just after creation, but clear the flag for the relevant pipes before exec?
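For readers following along, this suggestion can be sketched in C roughly as follows. It assumes plain POSIX pipe() and fcntl(); the helper names set_inheritable and make_private_pipe are made up for illustration and are not part of std.process.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: clear FD_CLOEXEC when inheritable != 0, set it
   otherwise. Returns 0 on success, -1 on error. */
int set_inheritable(int fd, int inheritable)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (inheritable)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) == -1 ? -1 : 0;
}

/* Hypothetical helper: create a pipe whose two ends are both marked
   close-on-exec right after creation. */
int make_private_pipe(int p[2])
{
    if (pipe(p) != 0)
        return -1;
    if (set_inheritable(p[0], 0) != 0 || set_inheritable(p[1], 0) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    return 0;
}
```

The spawning code would then call set_inheritable(fd, 1) only on the end it hands to the child. Note that pipe() followed by fcntl() is not atomic, so another thread that forks in between can still see the descriptors without the flag.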
Mar 09 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 19:51:49 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Saturday, 9 March 2013 at 16:05:15 UTC, Lars T. Kyllingstad wrote:
 1. Make a "special" spawnProcess() function for pipe redirection.
 2. Use the "process object" approach, like Tango and Qt.
 3. After fork(), in the child process, loop over the full range of  
 possible file descriptors and close the ones we don't want open.

 The last one would let us keep the current API (and would have the  
 added benefit of cleaning up unused FDs) but I have no idea how it  
 would impact performance.
How about this: Set FD_CLOEXEC on all pipes just after creation, but clear the flag for the relevant pipes before exec?
This doesn't help if other threads are randomly opening file descriptors. That is a problem I don't think we considered. Unix's design here is very outdated and seems to assume a single-threaded app. This does make me think of another good point: we should unset the FD_CLOEXEC flag on stdout, stdin, and stderr! -Steve
Mar 09 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 10 March 2013 at 02:01:33 UTC, Steven Schveighoffer 
wrote:
 How about this: Set FD_CLOEXEC on all pipes just after 
 creation, but clear the flag for the relevant pipes before 
 exec?
This doesn't help if other threads are randomly opening file descriptors. That is a problem I don't think we considered.
OK, but it will still solve the specific problem with the other end being open.
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 05:58:31 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Sunday, 10 March 2013 at 02:01:33 UTC, Steven Schveighoffer wrote:
 How about this: Set FD_CLOEXEC on all pipes just after creation, but  
 clear the flag for the relevant pipes before exec?
This doesn't help if other threads are randomly opening file descriptors. That is a problem I don't think we considered.
OK, but it will still solve the specific problem with the other end being open.
Yes it does. I'd rather solve both, though. -Steve
Mar 12 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer 
wrote:
 OK, but it will still solve the specific problem with the 
 other end being open.
Yes it does. I'd rather solve both, though.
OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do"). I was thinking that we could implement both approaches (closing all FDs after forking, and setting FD_CLOEXEC where appropriate), as an escape hatch: if later we suddenly find out that one of them was a horrible idea, we can simply remove it without much consequence.
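A rough C sketch of the "close all FDs after forking" idea, for illustration only. The function names are hypothetical, the upper bound comes from sysconf(_SC_OPEN_MAX), and the fallback of 1024 is an assumption. The loop visits every possible descriptor number, which is exactly where the scaling reservation for high max-FD environments comes from.

```c
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

/* Close every descriptor in [lowfd, maxfd) that is not listed in keep[].
   Meant to run in the child, between fork() and exec(). */
void close_range_except(int lowfd, int maxfd, const int *keep, size_t nkeep)
{
    for (int fd = lowfd; fd < maxfd; ++fd) {
        int wanted = 0;
        for (size_t i = 0; i < nkeep; ++i) {
            if (keep[i] == fd) {
                wanted = 1;
                break;
            }
        }
        if (!wanted)
            close(fd);   /* EBADF for unused slots is expected and ignored */
    }
}

/* Convenience wrapper: spare stdin/stdout/stderr and scan up to the
   per-process descriptor limit. */
void close_all_except(const int *keep, size_t nkeep)
{
    long maxfd = sysconf(_SC_OPEN_MAX);
    if (maxfd < 0)
        maxfd = 1024;    /* assumed fallback when the limit is indeterminate */
    if (maxfd > INT_MAX)
        maxfd = INT_MAX;
    close_range_except(3, (int)maxfd, keep, nkeep);
}
```

The keep list is the part the caller controls: it would hold whatever descriptors the child is really supposed to receive besides 0, 1 and 2.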
Mar 12 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 15:02:46 UTC, Vladimir Panteleev 
wrote:
 we could implement
(I use the word "we" very liberally in this thread)
Mar 12 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 15:07:09 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 12 March 2013 at 15:02:46 UTC, Vladimir Panteleev 
 wrote:
 we could implement
(I use the word "we" very liberally in this thread)
Nah, implementing it is easy. Getting the API right is the hard part. So "we" is perfectly OK. :) Lars
Mar 12 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer wrote:
 OK, but it will still solve the specific problem with the other end  
 being open.
Yes it does. I'd rather solve both, though.
OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do").
I have those same reservations on scaling. Anecdotal testing shows it takes about .1 microseconds per call to close on Lars' machine, so it's likely not terrible. The "right thing to do" is impossible since the OS doesn't give us the tools :) The best interface would be a list of file descriptors to keep open as a parameter to exec/CreateProcess. That should have been the original interface on all platforms. Windows ALMOST has this right, as it takes handles for stdin/stdout/stderr, but it requires that you also inherit all other inheritable handles in order to do that...
 I was thinking that we could implement both approaches (closing all FDs  
 after forking, and setting FD_CLOEXEC where appropriate), as an escape  
 hatch: if later we suddenly find out that one of them was a horrible  
 idea, we can simply remove it without much consequence.
Since all the solutions we are talking about are implementation details, not specifically requested by the user, it should be easy to switch from one to the other. -Steve
Mar 12 2013
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 15:19:25 UTC, Steven Schveighoffer 
wrote:
 On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:
 I was thinking that we could implement both approaches 
 (closing all FDs after forking, and setting FD_CLOEXEC where 
 appropriate), as an escape hatch: if later we suddenly find 
 out that one of them was a horrible idea, we can simply remove 
 it without much consequence.
Since all the solutions we are talking about are implementation details, not specifically requested by the user, it should be easy to switch from one to the other.
What I'm worried about is what we can't predict: unexpected side effects. Disabling one approach should have a less drastic impact than replacing it with another.
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 What I'm worried about is what we can't predict: unexpected side  
 effects. Disabling one approach should have a less drastic impact than  
 replacing it with another.
If someone depends on the side effects of one approach, it won't matter how much less drastic it is; for them it will be bad if we disable it :) I think it's reasonable to expect Phobos to do what the normal OS functions do. If we add FD_CLOEXEC to open pipes or other file descriptors, then we could never disable that, as people may depend on it. As our current code does NOT do that, we may also break code simply by adding that flag. -Steve
Mar 12 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Tue, 12 Mar 2013 11:38:35 -0400
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:
 
 What I'm worried about is what we can't predict: unexpected side  
 effects. Disabling one approach should have a less drastic impact than  
 replacing it with another.
If someone depends on the side effects of one approach, it won't matter how much less drastic it is; for them it will be bad if we disable it :) I think it's reasonable to expect Phobos to do what the normal OS functions do. If we add FD_CLOEXEC to open pipes or other file descriptors, then we could never disable that, as people may depend on it. As our current code does NOT do that, we may also break code simply by adding that flag. -Steve
What makes you think Phobos should carry on all the OS-specific quirks? I think a cross-platform library should offer the same behavior on all systems. In this case it can also be seen as covering up for an arguably bad Posix API design. Secondly, unless you do a pure fork(), you won't have the data structures (like File, Socket) available any more in the sub-process to actually use the inherited file descriptors. But even if it breaks code, the affected developers will hopefully understand that not inheriting descriptors by default is safer in the long run. -- Marco
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 13:48:04 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 On Tue, 12 Mar 2013 11:38:35 -0400
 "Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Tue, 12 Mar 2013 11:24:19 -0400, Vladimir Panteleev
 <vladimir thecybershadow.net> wrote:

 What I'm worried about is what we can't predict: unexpected side
 effects. Disabling one approach should have a less drastic impact than
 replacing it with another.
If someone depends on the side effects of one approach, it won't matter how much less drastic it is; for them it will be bad if we disable it :) I think it's reasonable to expect Phobos to do what the normal OS functions do. If we add FD_CLOEXEC to open pipes or other file descriptors, then we could never disable that, as people may depend on it. As our current code does NOT do that, we may also break code simply by adding that flag. -Steve
What makes you think Phobos should carry on all the OS specific quirks? I think a cross-platform library should offer the same behavior on all systems. In this case it can also be seen as covering up for an arguably bad Posix API design.
D's design should not preclude what is possible. Someone may have a good reason to use that bad API. What I want to avoid is having to require the user to pay attention to handle inheritance if he doesn't care, but also I don't want Phobos to puke when you decide to (or have to) circumvent phobos when creating file descriptors. The most robust method is to close all the descriptors we don't want to pass, regardless of how they are opened/configured.
 Secondly, unless you do a pure fork(), you won't have the data
 structures (like File, Socket) available any more in the
 sub-process to actually use the inherited file descriptors.
Linux file descriptors are integers. Not quite that hard to pass another file descriptor via number 3 or 4 for example. Windows handles (I think) are guaranteed to be the same value in the child process. But they aren't sequential integers starting at 0, so you would have to pass somehow, perhaps via parameters, what the handle values are. I argued early on that the handles to spawnProcess should be unbuffered because the buffer will not be passed. This can result in very weird output if both parent and child keep the same handle open. But File is the standard structure for Phobos, so that was used.
 But even if it breaks code, the affected developers will
 hopefully understand that not inheriting descriptors by
 default is safer in the long run.
This is not a good position to have as the should-be-agnostic standard library. Phobos should make the most useful/common/safe idioms the default, but make the non-safe ones possible. The idea of marking all descriptors as close on exec puts unnecessary burden on those who want to use straight fork/exec and want to pass Phobos-created descriptors. Having spawnProcess depend on those flag settings puts unnecessary burden on people who use non-phobos calls to open file descriptors and want to use spawnProcess. I'd rather avoid that. Both methods are very similar, I just feel calling close on all descriptors between fork and exec is more effective, and puts the burden on spawnProcess instead of the user or other parts of Phobos. -Steve
Mar 12 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
On Tue, 12 Mar 2013 14:37:31 -0400
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Tue, 12 Mar 2013 13:48:04 -0400, Marco Leise <Marco.Leise gmx.de> wrote:
 
 What makes you think Phobos should carry on all the OS
 specific quirks? I think a cross-platform library should
 offer the same behavior on all systems. In this case it can
 also be seen as covering up for an arguably bad Posix API
 design.
D's design should not preclude what is possible. Someone may have a good reason to use that bad API.
I don't want to stop anyone from doing that. I'm just trying not to write book pages here. :) It's good that we have "unwrappers" in "File": getFP (for the C stream) and fileno (for the file descriptor) as well as the open() syscall.
 What I want to avoid is having to require the user to pay attention to  
 handle inheritance if he doesn't care, but also I don't want Phobos to  
 puke when you decide to (or have to) circumvent phobos when creating file  
 descriptors.
Exactly! And the thousands of bug reports (literally!) in other software seem to indicate that an uninstructed human being assumes that files don't remain open in sub-processes. We aren't even talking about fork here, but really new processes created with the exec family of calls. Also common logic suggests that this behavior would not make sense, since what can a sub-process do with open files when I didn't give it the descriptor numbers?
 The most robust method is to close all the descriptors we don't want to  
 pass, regardless of how they are opened/configured.
Ok I give in. After reading the Windows API documentation and also the SysV daemon writing tips, I had to come to the same conclusion. (In addition to not creating leaking file descriptors in Phobos in the first place that is.)
 Linux file descriptors are integers.  Not quite that hard to pass another  
 file descriptor via number 3 or 4 for example.
No, but at that point you are no longer ignorant about leaking descriptors; you check what the Phobos file abstraction layer does and can unset FD_CLOEXEC:
int fileno = file.fileno();
int fdflags = fcntl(fileno, F_GETFD);
fcntl(fileno, F_SETFD, fdflags & ~FD_CLOEXEC);
or use syscalls directly.
 This is not a good position to have as the should-be-agnostic standard  
 library.  Phobos should make the most useful/common/safe idioms the  
 default, but make the non-safe ones possible.  The idea of marking all  
 descriptors as close on exec puts unnecessary burden on those who want to  
 use straight fork/exec and want to pass Phobos-created descriptors.   
We agree on the "possible" and also on the most "useful/common/safe" aspect. But the conclusions that we draw are different. To you (as I read it) Phobos offers functionality to create Posix file descriptors, to me it only creates file abstractions that encapsulate and level OS quirks. It looks to me like the issue was just not considered when writing Phobos and we can do it like Ruby and add that property to files that allows them to stay open in sub-processes. But this should not be a Posix only option. SetHandleInformation (with HANDLE_FLAG_INHERIT) should do the job on Windows just as well.
 Having spawnProcess depend on those flag settings puts unnecessary burden  
 on people who use non-phobos calls to open file descriptors and want to  
 use spawnProcess.  I'd rather avoid that.
 
 Both methods are very similar, I just feel calling close on all  
 descriptors between fork and exec is more effective, and puts the burden  
 on spawnProcess instead of the user or other parts of Phobos.
 
 -Steve
Yeah, alright. Close them all by default. But keep in mind what I said in the last post: as likely as it is that people don't use Phobos to open files, a library might use fork/exec directly and leak our Phobos files. The problem should not be tackled in std.process alone. I'll probably try my hand at a pull request in the coming days that closes Phobos file descriptors on exec and see about a simple system-agnostic(!) property to enable it. After all, why should Windows folks not be able to pass handles to sub-processes? -- Marco
Mar 12 2013
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 16:23:33 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 Exactly! And the thousands of bug reports (literally!) in other
 software seem to indicate that an uninstructed human being
 assumes that files don't remain open in sub-processes. We
 aren't even talking about fork here, but really new processes
 created with the exec family of calls.
Given how awesome spawnProcess is, those problems should be very rare, who would use fork and exec? :)
 Also common logic suggests that this behavior would not make
 sense, since what can a sub-process do with open files when I
 didn't give it the descriptor numbers?
On unix at least, there can be an agreement that certain descriptors are passed as certain numbers. The code that runs between fork and exec does the plumbing. We already have agreement on what 0, 1, and 2 are. However, I will agree that this is NOT a common thing to do. I also agree that logic (and practicality) suggests the default on Unix should have been to not inherit descriptors. In fact, you shouldn't really unset the close on exec flag except after calling fork to avoid racing with other threads that may be calling fork/exec at the same time.
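As an illustration of "the code that runs between fork and exec does the plumbing", here is a hedged C sketch in which parent and child agree that descriptor 3 is an extra output channel. The function name run_with_fd3 and the choice of /bin/sh as the child are assumptions made for the example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `/bin/sh -c script` with the write end of a fresh pipe installed as
   descriptor 3 in the child, and copies what the child wrote there into
   buf (NUL-terminated). Returns the child's exit status, or -1 on error. */
int run_with_fd3(const char *script, char *buf, size_t cap)
{
    int p[2];
    if (cap == 0 || pipe(p) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Plumbing in the child: drop the read end, then move the write
           end to the agreed number. */
        close(p[0]);
        if (p[1] != 3) {
            if (dup2(p[1], 3) < 0)
                _exit(126);
            close(p[1]);
        }
        execl("/bin/sh", "sh", "-c", script, (char *)NULL);
        _exit(127);
    }

    close(p[1]);                       /* parent keeps only the read end */
    size_t len = 0;
    ssize_t n;
    while (len < cap - 1 && (n = read(p[0], buf + len, cap - 1 - len)) > 0)
        len += (size_t)n;
    buf[len] = '\0';
    close(p[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_with_fd3("echo hi >&3", out, sizeof out) lets the shell write to descriptor 3 without ever being told how the pipe was created. All descriptor changes happen in the child after fork(), so other threads in the parent never observe them.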
 Linux file descriptors are integers.  Not quite that hard to pass  
 another
 file descriptor via number 3 or 4 for example.
No, but at that point you are no longer ignorant about leaking descriptors; you check what the Phobos file abstraction layer does and can unset FD_CLOEXEC:
Yeah, but why were we talking about unintended leaking? I thought we were talking about intended leaking, and you were saying you can't use the inherited descriptors: "Secondly, unless you do a pure fork(), you won't have the data structures (like File, Socket) available any more in the sub-process to actually use the inherited file descriptors." I was contending it was quite possible to intentionally pass the descriptors as certain numbers to the child process, and then re-wrap them in new File objects.
 int fileno = file.fileno();
 int fdflags = fcntl(fileno, F_GETFD);
 fcntl(fileno, F_SETFD, fdflags & ~FD_CLOEXEC);
We actually need to do this in order to properly pass the standard handles.
 This is not a good position to have as the should-be-agnostic standard
 library.  Phobos should make the most useful/common/safe idioms the
 default, but make the non-safe ones possible.  The idea of marking all
 descriptors as close on exec puts unnecessary burden on those who want  
 to
 use straight fork/exec and want to pass Phobos-created descriptors.
We agree on the "possible" and also on the most "useful/common/safe" aspect. But the conclusions that we draw are different. To you (as I read it) Phobos offers functionality to create Posix file descriptors, to me it only creates file abstractions that encapsulate and level OS quirks. It looks to me like the issue was just not considered when writing Phobos and we can do it like Ruby and add that property to files that allows them to stay open in sub-processes. But this should not be a Posix only option. SetHandleInformation (with HANDLE_FLAG_INHERIT) should do the job on Windows just as well.
Having a property to fetch and set inheritance certainly would be a good addition to Phobos. I still think we should do what the OS specifies by default. There is no real "right" answer there. -Steve
Mar 12 2013
prev sibling parent Johannes Pfau <nospam example.com> writes:
On Tue, 12 Mar 2013 11:19:25 -0400
"Steven Schveighoffer" <schveiguy yahoo.com> wrote:

 On Tue, 12 Mar 2013 11:02:45 -0400, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:
 
 On Tuesday, 12 March 2013 at 13:56:28 UTC, Steven Schveighoffer
 wrote:
 OK, but it will still solve the specific problem with the other
 end being open.
Yes it does. I'd rather solve both, though.
OK. The idea to close all FDs after forking seems to be the best solution so far, although I have some reservations (scaling for high max-FD environments, and it doesn't sound like "the right thing to do").
I have those same reservations on scaling.
Daemon implementations on POSIX systems should do the same thing according to best practices: http://0pointer.de/public/systemd-man/daemon.html (step 1). libdaemon also does that: http://git.0pointer.de/?p=libdaemon.git;a=blob;f=libdaemon/dfork.c;h=783033fe290e715df562f20d292adfc9f26e2e3a;hb=HEAD#l491 systemd probably does as well, so it really seems there is no better solution.
Mar 12 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 11:05:14 -0500, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Wednesday, 6 March 2013 at 16:45:51 UTC, Steven Schveighoffer wrote:
 On Tue, 05 Mar 2013 17:38:09 -0500, Vladimir Panteleev  
 <vladimir thecybershadow.net> wrote:

 By the way, I should mention that I ran into several issues while  
 trying to come up with the above example. The test program does not  
 work on Windows, for some reason I get the exception:

 std.process2.ProcessException std\process2.d(494): Failed to spawn new  
 process (The parameter is incorrect.)
I think Lars is on that.
I'm going to need some help with this one. I only have Linux on my computer, and I can't reproduce the bug in Wine.
Tried as well. I have only a 32-bit license for Windows XP, so I don't have a 64-bit VM to test with (this is not Wine, but VMware, which should be exactly the same as running on a real Windows box). I gave away my Windows 7 64-bit box :( Anyway, on 32-bit XP I get a successful run: 100 lines of stdout, 100 lines of stderr. I can possibly try it on a laptop from work, but not until Monday. So it probably is a 64-bit-only issue. I know you just added this part, Lars, and it uses Microsoft's runtime. Very different from the DMC runtime. But both should use the same OS call. Will take a closer look at the code around that line. Vladimir, can you try compiling 32-bit Windows and see if it works for you, just to confirm? -Steve
Mar 09 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
 So it probably is a 64-bit-only issue.  I know you just added this part  
 Lars, and it uses microsoft's runtime.  Very different from the DMC  
 runtime.  But both should use the same OS call.  Will take a closer look  
 at the code around that line.
Looks fine to me. Doing some searching online, the overwhelming results for that specific error are for failed ANT build or java execution due to a huge class path (which I think is either passed on the command line or via environment?) Try shortening your executable path/environment? Just a guess... -Steve
Mar 09 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 10 March 2013 at 03:06:03 UTC, Steven Schveighoffer 
wrote:
 Try shortening your executable path/environment?
Didn't help.
Mar 09 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 10 March 2013 at 02:54:44 UTC, Steven Schveighoffer 
wrote:
 Vladimir, can you try compiling 32-bit windows and see if it 
 works for you, just to confirm?
I'm seeing the same exception with both 32 and 64 bit, Steven. I guess it must be something specific to my system, like the HANDLE_FLAG_INHERIT stuff.
Mar 09 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 22:07:26 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Sunday, 10 March 2013 at 02:54:44 UTC, Steven Schveighoffer wrote:
 Vladimir, can you try compiling 32-bit windows and see if it works for  
 you, just to confirm?
I'm seeing the same exception with both 32 and 64 bit, Steven. I guess it must be something specific to my system, like the HANDLE_FLAG_INHERIT stuff.
IIRC, you were able to get something working there with std.process2. Is that example still working? I'm trying to see what could be different that's causing this. I can try a test on a win7 box monday. Likely the issue is xp vs. win7. -Steve
Mar 09 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 10 March 2013 at 03:48:36 UTC, Steven Schveighoffer 
wrote:
 On Sat, 09 Mar 2013 22:07:26 -0500, Vladimir Panteleev 
 <vladimir thecybershadow.net> wrote:

 On Sunday, 10 March 2013 at 02:54:44 UTC, Steven Schveighoffer 
 wrote:
 Vladimir, can you try compiling 32-bit windows and see if it 
 works for you, just to confirm?
I'm seeing the same exception with both 32 and 64 bit, Steven. I guess it must be something specific to my system, like the HANDLE_FLAG_INHERIT stuff.
IIRC, you were able to get something working there with std.process2. Is that example still working? I'm trying to see what could be different that's causing this.
Yes, Steven. The ls test program still works: http://dump.thecybershadow.net/935b0c4a47ce367313efcc1806f75076/lstest.d
Mar 09 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 09 Mar 2013 23:52:26 -0500, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Sunday, 10 March 2013 at 03:48:36 UTC, Steven Schveighoffer wrote:
 IIRC, you were able to get something working there with std.process2.   
 Is that example still working?  I'm trying to see what could be  
 different that's causing this.
Yes, Steven. The ls test program still works: http://dump.thecybershadow.net/935b0c4a47ce367313efcc1806f75076/lstest.d
OK, I was able to reproduce on a Windows 7 box, and I found the problem. Really dumb. It was the difference in the spawnProcess calls between the two programs that gave me the hint.
So the environment pointer to CreateProcess is a list of null-separated "var=value" strings. The last string is followed by an additional null, so the function can identify the end of the list.
HOWEVER, if you have NO variables, there is no double null, because there isn't the null from the last variable. On XP, a single null character is fine, whereas Windows 7 REQUIRES an extra null character, even if you didn't have any variables. Adding the extra null character in toEnvz fixed the problem.
Now, that actually brings up another bug I think. pipeProcess is sending a null to spawnProcess for the environment. However, because AAs are structs, and null is the same as an empty AA, it gets translated into "kill the entire environment" for the child process. I think this is wrong. But at the same time, what if you DID want to kill the entire environment? Should we even support that? Either way I don't think pipeProcess should kill the environment, so that needs to be changed.
Passing null to CreateProcessW copies the parent's environment, and I think that should be the ultimate default when the environment is not specified, even for pipeProcess. But what should we do if the spawnProcess overload that takes an environment receives a null environment? My instinct is to detect an empty AA, and interpret that as "copy parent environment"; even Lars seems to have interpreted it that way (maybe it works that way on Linux as it's written now?) in how he wrote pipeProcess.
I think we can forgo the pull request; the solution is simply to add another '\0', which will handle the case where the environment is empty and be a noop when the environment has stuff in it. But maybe we want to make toEnvz return null if it gets an empty AA to avoid killing the environment? I have no idea what the right answer is. -Steve
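The environment block layout described above can be sketched in portable C. This is only an illustration of the byte layout, not the toEnvz implementation: it uses narrow chars for brevity (CreateProcessW would need wide characters), the name build_env_block is hypothetical, and it appends the second NUL only in the empty case rather than unconditionally.

```c
#include <stdlib.h>
#include <string.h>

/* Builds a CreateProcess-style block from n "NAME=value" strings. Each
   string is NUL-terminated and the block always ends in two NULs, also
   when n == 0. The caller frees the result; *out_len gets the byte count. */
char *build_env_block(const char *const *vars, size_t n, size_t *out_len)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += strlen(vars[i]) + 1;

    /* One extra NUL ends the block. An empty block needs two, because
       there is no "last string" to contribute the first one. */
    size_t len = (n == 0) ? 2 : total + 1;

    char *block = calloc(len, 1);      /* zero-filled, so terminators are in place */
    if (block == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < n; ++i) {
        size_t sz = strlen(vars[i]) + 1;
        memcpy(block + pos, vars[i], sz);
        pos += sz;
    }
    if (out_len != NULL)
        *out_len = len;
    return block;
}
```

For the two variables A=1 and B=2 this yields the nine bytes `A=1\0B=2\0\0`; for no variables it yields `\0\0`, which is the case that tripped up Windows 7.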
Mar 11 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 11 March 2013 at 23:21:18 UTC, Steven Schveighoffer 
wrote:
 OK, I was able to reproduce on a windows 7 box, and I found the 
 problem.  Really dumb.  It was the difference in the 
 spawnProcess calls between the two programs that gave me the 
 hint.
Awesome! :)
 So the environment pointer to CreateProcess is a list of 
 null-separated "var=value" strings.  The last string is 
 followed by an additional null, so the function can identify 
 the end of the list.

 HOWEVER, if you have NO variables, there is no double null, 
 because there isn't the null from the last variable.  However, 
 on XP, a single null character is fine, whereas on windows 7, 
 it REQUIRES an extra null character, even if you didn't have 
 any variables.  Adding the extra null character in toEnvz fixed 
 the problem.
I never would have thought of trying that, especially considering that the CreateProcess() documentation is rather vague on this point. It says: "A Unicode environment block is terminated by four zero bytes: two for the last string, two more to terminate the block." (This confused me slightly, until I remembered that one zero wchar is two zero bytes.) It does not explicitly mention the case where there are *no* environment variables, but from the above statement, I had inferred that one zero character would suffice, considering there is no "last string" to terminate. But it seems I was wrong. :)
 Now, that actually brings up another bug I think.  pipeProcess 
 is sending a null to spawnProcess for the environment.  
 However, because AA's are structs, and null is the same as an 
 empty AA, it gets translated into "kill the entire environment" 
 for the child process.  I think this is wrong.  But at the same 
 time, what if you DID want to kill the entire environment?  
 Should we even support that?  Either way I don't think 
 pipeProcess should kill the environment, so that needs to be 
 changed.

 Passing null to CreateProcessW copies the parent's environment, 
 I think that should be the ultimate default when the 
 environment is not specified, even for pipeProcess.  But what 
 should we do if the spawnProcess overload that takes an 
 environment receives a null environment?  My instinct is to 
 detect an empty AA, and interpret that as "copy parent 
 environment," even Lars seems to have interpreted it that way 
 (maybe it works that way on Linux as it's written now?) in how 
 he wrote pipeProcess.

 I think we can forgo the pull request, the solution is simply 
 to add another '\0', that will handle the case where the 
 environment is empty and be a noop for when the environment has 
 stuff in it.  But maybe we want to make toEnvz return null if 
 it gets an empty AA to avoid killing the environment?  I have 
 no idea what the right answer is.
Nice catch! Fortunately, the only bug here is that I've specified a null parameter in the spawn call in pipeProcessImpl(). If you look at the various spawnProcess() overloads, you'll see that they are designed as follows:
If you omit the AA parameter altogether, the child inherits the parent's environment. A null pointer is passed as 'envz' to spawnProcessImpl(), which in turn passes this straight to CreateProcess() on Windows and replaces it by 'environ' on POSIX.
If you do specify the AA parameter, it is passed through toEnvz() on its way to spawnProcessImpl(). If the AA is empty/null, toEnvz() will create an empty (but non-null) environment block, and the child's environment will be empty.
I think the bug stems from the fact that, at some intermediate stage of the development, pipeProcessImpl() used to call spawnProcessImpl() directly. I later changed this to spawnProcess(), but forgot to remove the null parameter. I'll simply remove it, and everything will work as intended.
Initially, I wrote spawnProcess() the way you suggest, i.e. that passing a null AA is equivalent to omitting it. But then we are left with no simple way to explicitly clear the child's environment. We could make a distinction between a "null" and an "empty" AA, but making a non-null empty AA is a hassle. I think it is better the way it is now.
I don't think we need to support clearing the child's environment in pipeProcess(). I suspect it is a rare need, and the user will just have to deal with spawnProcess() directly in that case. (It's like how there are all kinds of different exec*() functions in C, but only one, simple, popen() function.) Lars
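The POSIX half of that design (null means inherit, an empty but non-null block means empty) might look like this in C. It is only a sketch of the semantics described above, not the actual spawnProcessImpl() code; run_sh is a made-up name and /bin/sh is assumed to exist.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;   /* the calling process's own environment (POSIX) */

/* Runs `/bin/sh -c script`. envp == NULL means "inherit the parent's
   environment"; a non-NULL array, even an empty one, replaces it.
   Returns the child's exit status, or -1 on error. */
int run_sh(const char *script, char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *const argv[] = { "sh", "-c", (char *)script, NULL };
        execve("/bin/sh", argv, envp != NULL ? envp : environ);
        _exit(127);      /* only reached if execve failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this split, clearing the child's environment is an explicit act (passing an array that holds only the terminating NULL), while the common case of inheriting needs no environment argument at all.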
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad  
<public kyllingen.net> wrote:


 Nice catch!  Fortunately, the only bug here is that I've specified a  
 null parameter in the spawn call in pipeProcessImpl().  If you look at  
 the various spawnProcess() overloads, you'll see that they are designed  
 as follows:

 If you omit the AA parameter altogether, the child inherits the parent's  
 environment.  A null pointer is passed as 'envz' to spawnProcessImpl(),  
 which in turn passes this straight to CreateProcess() on Windows and  
 replaces it by 'environ' on POSIX.

 If you do specify the AA parameter, it is passed through toEnvz() on its  
 way to spawnProcessImpl().  If the AA is empty/null, toEnvz() will  
 create an empty (but non-null) environment block, and the child's  
 environment will be empty.
I understand that point. I am a little concerned, however, that passing null as env results in clearing the child environment. My concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment"; you have to call a different function.

I suppose it's no different from exec, where you have to call the right function depending on what you want.

So this sounds fine. The toEnvz still should be fixed to add an extra null; I'm assuming you're doing that, right? :)
 I think the bug stems from the fact that, at some intermediate stage of  
 the development, pipeProcessImpl() used to call spawnProcessImpl()  
 directly.  I later changed this to spawnProcess(), but forgot to remove  
 the null parameter.  I'll simply remove it, and everything will work as  
 intended.
OK, this makes sense.
 Initially, I wrote spawnProcess() the way you suggest, i.e. that  
 passing a null AA is equivalent to omitting it.  But then we are left  
 with no simple way to explicitly clear the child's environment.  We  
 could make a distinction between a "null" and an "empty" AA, but making  
 a non-null empty AA is a hassle.  I think it is better the way it is now.
OK
 I don't think we need to support clearing the child's environment in  
 pipeProcess().  I suspect it is a rare need, and the user will just have  
 to deal with spawnProcess() directly in that case.  (It's like how there  
 are all kinds of different exec*() functions in C, but only one, simple,  
 popen() function.)
Agree. -Steve
Mar 12 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 14:10:34 UTC, Steven Schveighoffer 
wrote:
 On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:
 If you omit the AA parameter altogether, the child inherits 
 the parent's environment.  A null pointer is passed as 'envz' 
 to spawnProcessImpl(), which in turn passes this straight to 
 CreateProcess() on Windows and replaces it by 'environ' on 
 POSIX.

 If you do specify the AA parameter, it is passed through 
 toEnvz() on its way to spawnProcessImpl().  If the AA is 
 empty/null, toEnvz() will create an empty (but non-null) 
 environment block, and the child's environment will be empty.
I understand that point. I am a little concerned, however, that passing null as env results in clearing the child environment. These concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment," you have to call a different function.
I'd be very interested to hear if you have a suggestion for a better way to do it, keeping in mind that there needs to be *some* way to clear the environment too.
 I suppose it's no different than exec, which you have to call 
 the right function depending on what you want.

 So this sounds fine.  The toEnvz still should be fixed to add 
 an extra null, I'm assuming you're doing that right? :)
Yes. I have some local modifications based on the discussion here, which I haven't pushed yet. Waiting for you guys to finish debating the file descriptor issue. ;) Lars
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 17:18:43 -0400, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Tuesday, 12 March 2013 at 14:10:34 UTC, Steven Schveighoffer wrote:
 On Tue, 12 Mar 2013 03:31:30 -0400, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:
 If you omit the AA parameter altogether, the child inherits the  
 parent's environment.  A null pointer is passed as 'envz' to  
 spawnProcessImpl(), which in turn passes this straight to  
 CreateProcess() on Windows and replaces it by 'environ' on POSIX.

 If you do specify the AA parameter, it is passed through toEnvz() on  
 its way to spawnProcessImpl().  If the AA is empty/null, toEnvz() will  
 create an empty (but non-null) environment block, and the child's  
 environment will be empty.
I understand that point. I am a little concerned, however, that passing null as env results in clearing the child environment. These concerns are simply that the most common desire is to inherit the environment, and that there is no parameter that simply says "inherit parent environment," you have to call a different function.
I'd be very interested to hear if you have a suggestion for a better way to do it, keeping in mind that there needs to be *some* way to clear the environment too.
Sadly, no, I don't. I had hoped [] would allocate an empty AA, but it fails to compile.

Note that you can "hack" it by setting a single environment variable which nobody will ever use, i.e.:

    spawnProcess("blah.exe", ["_____":"_____"]);

But that is really, really ugly.
 I suppose it's no different than exec, which you have to call the right  
 function depending on what you want.

 So this sounds fine.  The toEnvz still should be fixed to add an extra  
 null, I'm assuming you're doing that right? :)
Yes. I have some local modifications based on the discussion here, which I haven't pushed yet. Waiting for you guys to finish debating the file descriptor issue. ;)
I think all are in agreement at this point that closing the files between fork and exec is a good solution. Whether or not to ALSO set FD_CLOEXEC wherever Phobos opens a file descriptor is an additional matter.

As a fallback to standard unix behavior, we can have a Config option that says "don't do the close thing".

-Steve
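For readers who want to see the strategy concretely, the following is a rough, POSIX-only sketch of closing descriptors between fork and exec. The function name runWithClosedFds is hypothetical and this is not the spawnProcess() implementation; it assumes sysconf(_SC_OPEN_MAX) gives a usable upper bound on descriptor numbers and falls back to an arbitrary guess of 256 otherwise.

```d
import core.sys.posix.sys.wait : waitpid, WEXITSTATUS, WIFEXITED;
import core.sys.posix.unistd : _exit, _SC_OPEN_MAX, close, execvp, fork, sysconf;
import std.string : toStringz;

/// Hypothetical helper: runs `args`, closing every descriptor except
/// stdin/stdout/stderr in the child. Returns the exit status, or -1.
int runWithClosedFds(string[] args)
{
    if (args.length == 0)
        return -1;

    // Build argv before forking so the child does not need to allocate.
    auto argz = new const(char)*[args.length + 1];
    foreach (i, arg; args)
        argz[i] = toStringz(arg);
    argz[$ - 1] = null;

    // Upper bound on descriptor numbers (assumption: 256 if unknown).
    auto maxFd = sysconf(_SC_OPEN_MAX);
    if (maxFd < 0)
        maxFd = 256;

    auto pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        // Child: keep 0, 1 and 2, close everything else before exec.
        foreach (fd; 3 .. cast(int) maxFd)
            close(fd);
        execvp(argz[0], argz.ptr);
        _exit(127); // only reached if exec failed
    }

    // Parent: wait for the child and report how it exited.
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The loop over every possible descriptor number is the cost of this approach: it needs no cooperation from code that opened the files, but its running time depends on the descriptor limit rather than on how many files are actually open.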
Mar 12 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer 
wrote:
 On Tue, 12 Mar 2013 17:18:43 -0400, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 On Tuesday, 12 March 2013 at 14:10:34 UTC, Steven 
 Schveighoffer wrote:
I think all are in agreement at this point that closing the files between fork and exec is a good solution. Whether or not to ALSO set FD_CLOEXEC wherever Phobos opens a file descriptor is an additional matter.
Yeah, I've read through the discussion now, and I think I agree. I dislike the idea of looping over an unknown number of file descriptors, but it is both the safest and least disruptive strategy.
 As a fallback to standard unix behavior, we can have a Config 
 option that says "don't do the close thing".
Config.dontDoTheCloseThing it is. ;) Lars
Mar 12 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer 
wrote:
 I'd be very interested to hear if you have a suggestion for a 
 better way to do it, keeping in mind that there needs to be 
 *some* way to clear the environment too.
Sadly, no I don't. I had hoped [] would allocate an empty AA, but it fails to compile. Note that you can "hack" it by setting a single environment variable which nobody will ever use. i.e. spawnProcess("blah.exe", ["_____":"_____"]); But that is really, really ugly.
How about this:

    @property string[string] emptyEnvironment()
    {
        string[string] result;
        result["a"] = "a";
        result.remove("a");
        assert(result && result.length == 0);
        return result;
    }

(can be cached to avoid allocating each time)
Mar 12 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Mar 13, 2013 at 04:27:21AM +0100, Vladimir Panteleev wrote:
 On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer
 wrote:
I'd be very interested to hear if you have a suggestion for a
better way to do it, keeping in mind that there needs to be
*some* way to clear the environment too.
Sadly, no I don't. I had hoped [] would allocate an empty AA, but it fails to compile. Note that you can "hack" it by setting a single environment variable which nobody will ever use. i.e. spawnProcess("blah.exe", ["_____":"_____"]); But that is really, really ugly.
How about this:

    @property string[string] emptyEnvironment()
    {
        string[string] result;
        result["a"] = "a";
        result.remove("a");
        assert(result && result.length == 0);
        return result;
    }

(can be cached to avoid allocating each time)
I like this idea. T -- It won't be covered in the book. The source code has to be useful for something, after all. -- Larry Wall
Mar 12 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 13 March 2013 at 03:27:25 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer 
 wrote:
 I'd be very interested to hear if you have a suggestion for a 
 better way to do it, keeping in mind that there needs to be 
 *some* way to clear the environment too.
Sadly, no I don't. I had hoped [] would allocate an empty AA, but it fails to compile. Note that you can "hack" it by setting a single environment variable which nobody will ever use. i.e. spawnProcess("blah.exe", ["_____":"_____"]); But that is really, really ugly.
How about this:

    @property string[string] emptyEnvironment()
    {
        string[string] result;
        result["a"] = "a";
        result.remove("a");
        assert(result && result.length == 0);
        return result;
    }

(can be cached to avoid allocating each time)
That's a lot better than ["____":"_____"], at least. :) But the difference between a null AA and an empty AA is still very subtle, and I am hesitant to design an API that depends on it. We'd have to explain to the users that "ok, so there are two kinds of empty AAs: the ones you've done nothing with, and the ones you've added and removed a value from..."

Furthermore, the language spec doesn't seem to mention "null" in relation to AAs. Shouldn't the difference between null and empty then be treated as an implementation detail? Can we even be sure that "aa is null" will work in two years?

Lars
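The distinction being debated can be made visible with a small sketch. The helper names makeEmptyNonNull and describe are invented for this illustration; the behaviour shown is what current implementations do and, as noted above, is not something the language spec is known to guarantee.

```d
/// Returns an AA that is empty but not null (the add-then-remove trick).
string[string] makeEmptyNonNull()
{
    string[string] aa;
    aa["a"] = "a";
    aa.remove("a");
    return aa;
}

/// Hypothetical classification an API would need if it relied on
/// telling a never-used AA apart from an emptied one.
string describe(string[string] env)
{
    if (env is null)
        return "null";
    if (env.length == 0)
        return "empty";
    return "non-empty";
}
```

Both kinds of AA report a length of zero and iterate over nothing, so a caller has no obvious way to tell them apart without the explicit `is null` check, which is the subtlety an API built on this distinction would have to document.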
Mar 12 2013
next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 13.03.2013 07:31, schrieb Lars T. Kyllingstad:
 On Wednesday, 13 March 2013 at 03:27:25 UTC, Vladimir Panteleev
 wrote:
 On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer
 wrote:
 I'd be very interested to hear if you have a suggestion for a
 better way to do it, keeping in mind that there needs to be
 *some* way to clear the environment too.
Sadly, no I don't. I had hoped [] would allocate an empty AA, but it fails to compile. Note that you can "hack" it by setting a single environment variable which nobody will ever use. i.e. spawnProcess("blah.exe", ["_____":"_____"]); But that is really, really ugly.
How about this:

    @property string[string] emptyEnvironment()
    {
        string[string] result;
        result["a"] = "a";
        result.remove("a");
        assert(result && result.length == 0);
        return result;
    }

(can be cached to avoid allocating each time)
That's a lot better than ["____":"_____"], at least. :) But still, the difference between a null AA and an empty AA is still very subtle, and I am hesitant to design an API that depends on it. We'd have to explain to the users that "ok, so there are two kinds of empty AAs: the ones you've done nothing with, and the ones you've added and removed a value from..." Furthermore, the language spec doesn't seem to mention "null" in relation to AAs. Shouldn't the difference between null and empty then be treated as an implementation detail? Can we even be sure that "aa is null" will work in two years? Lars
why not differentiate on callsite?

like

    environment_usage =
    {
       PARENT_ENVIRONMENT,
       NONE_ENVIRONMENT, // which differs from empty given environment
       GIVEN_ENVIRONMENT
    }

    spawnProcess(process, parameter, environment_usage = PARENT_ENVIRONMENT, environment = null)

it feels very wrong to put the environment "usage" type in any way into the environment-abstraction itself (by occupying null or empty...)

+ some nice helpers

    spawnProcessWithParentEnvironment(process, parameter)
    spawnProcessWithoutEnvironment(process, parameter)
    spawnProcessWithEnvironment(process, parameter, environment=...)

wouldn't that be much clearer?

the other way could be a

    spawnProcess(process, parameter, environment=use_parent_environment());

with parent-environment selector

    spawnProcess(process, parameter, environment=given_environment(environment));
Mar 12 2013
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 13 Mar 2013 02:45:57 -0400, dennis luehring <dl.soluz gmx.net>
wrote:

 Am 13.03.2013 07:31, schrieb Lars T. Kyllingstad:
 On Wednesday, 13 March 2013 at 03:27:25 UTC, Vladimir Panteleev
 wrote:
 On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer
 wrote:
 I'd be very interested to hear if you have a suggestion for a
 better way to do it, keeping in mind that there needs to be
 *some* way to clear the environment too.
Sadly, no I don't. I had hoped [] would allocate an empty AA, but it fails to compile. Note that you can "hack" it by setting a single environment variable which nobody will ever use. i.e. spawnProcess("blah.exe", ["_____":"_____"]); But that is really, really ugly.
How about this:

    @property string[string] emptyEnvironment()
    {
        string[string] result;
        result["a"] = "a";
        result.remove("a");
        assert(result && result.length == 0);
        return result;
    }

(can be cached to avoid allocating each time)
That's a lot better than ["____":"_____"], at least. :) But still, the difference between a null AA and an empty AA is still very subtle, and I am hesitant to design an API that depends on it. We'd have to explain to the users that "ok, so there are two kinds of empty AAs: the ones you've done nothing with, and the ones you've added and removed a value from..." Furthermore, the language spec doesn't seem to mention "null" in relation to AAs. Shouldn't the difference between null and empty then be treated as an implementation detail? Can we even be sure that "aa is null" will work in two years? Lars
why not differentiate on callsite?

like

    environment_usage =
    {
       PARENT_ENVIRONMENT,
       NONE_ENVIRONMENT, // which differs from empty given environment
       GIVEN_ENVIRONMENT
    }

    spawnProcess(process, parameter, environment_usage = PARENT_ENVIRONMENT, environment = null)

it feels very wrong to put the environment "usage" type in any way into the environment-abstraction itself (by occupying null or empty...)

+ some nice helpers

    spawnProcessWithParentEnvironment(process, parameter)
    spawnProcessWithoutEnvironment(process, parameter)
    spawnProcessWithEnvironment(process, parameter, environment=...)

wouldn't that be much clearer?

the other way could be a

    spawnProcess(process, parameter, environment=use_parent_environment());

with parent-environment selector

    spawnProcess(process, parameter, environment=given_environment(environment));
Hm.. I think I actually like this.

I hate to have feature creep at this point, but one kind of annoying thing is, if you want to *add* to the current environment, it is a multi-step process:

    auto curenv = environment.toAA;
    curenv["x"] = "y";
    spawnProcess("helloworld", curenv);

But with something similar to Dennis' idea, we have a possible way to do that without making a copy of the current environment into an AA and adding:

    struct EnvironmentArg
    {
        this(string[string] env, bool useParent=false) { this.env = env; this.useParent = useParent; }
        this(bool useParent) { this.useParent = useParent; }
        string[string] env;
        bool useParent;
    }

    spawnProcess("helloworld", EnvironmentArg(["x":"y"], true)); // use parent environment, add x=y
    spawnProcess("helloworld", EnvironmentArg(["x":"y"]));       // replace environment with x=y
    spawnProcess("helloworld", EnvironmentArg(false));           // use empty environment
    spawnProcess("helloworld", EnvironmentArg(true));            // use parent environment exactly

EnvironmentArg should probably have a better name, and I would recommend some global functions that make common things, like:

    EnvironmentArg emptyEnvironment() { return EnvironmentArg(null, false); }
    EnvironmentArg parentEnvironment() { return EnvironmentArg(null, true); }

Like? Hate?

-Steve
Mar 13 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 13 March 2013 at 20:26:44 UTC, Steven Schveighoffer 
wrote:
 I hate to have feature creep at this point, but one kind of 
 annoying thing
 is, if you want to *add* to the current environment, it is a 
 multi-step
 process:

 auto curenv = environment.toAA;
 curenv["x"] = "y";
 spawnProcess("helloworld", curenv);

 But with something similar to Dennis' idea, we have a possible 
 way to do
 that without making a copy of the current environment into an 
 AA and
 adding:

 struct EnvironmentArg
 {
      this(string[string] env, bool useParent=false) { this.env 
 = env;
 this.useParent = useParent;}
      this(bool useParent) {this.useParent = useParent;}
      string[string] env;
      bool useParent;
 }

 spawnProcess("helloworld", EnvironmentArg(["x":"y"], true)); // 
 use parent
 environment, add x=y
 spawnProcess("helloworld", EnvironmentArg(["x":"y"])); // 
 replace
 environment with x=y
 spawnProcess("helloworld", EnvironmentArg(false)); // use empty 
 environment
 spawnProcess("helloworld", EnvironmentArg(true)); // use parent
 environment exactly

 EnvironmentArg should probably have better name, and I would 
 recommend
 some global functions that make common things, like:

 EnvironmentArg emptyEnvironment() { return EnvironmentArg(null, 
 false);}
 EnvironmentArg parentEnvironment() { return 
 EnvironmentArg(null, true);}

 Like? Hate?
Hmm.. what if spawnProcess() takes a normal string[string] like it does now, but we add a flag to Config that determines whether it is merged with the parent's environment or not?

    string[string] myEnv = [ "foo" : "bar" ];
    spawnProcess("helloworld", null);                        // Parent's env
    spawnProcess("helloworld", myEnv);                       // Parent's env + myEnv
    spawnProcess("helloworld", null, ..., Config.clearEnv);  // Empty env
    spawnProcess("helloworld", myEnv, ..., Config.clearEnv); // Only myEnv

Lars
Mar 13 2013
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Mar 13, 2013 at 09:43:59PM +0100, Lars T. Kyllingstad wrote:
 On Wednesday, 13 March 2013 at 20:26:44 UTC, Steven Schveighoffer
 wrote:
[...]
But with something similar to Dennis' idea, we have a possible way to
do that without making a copy of the current environment into an AA
and adding:

struct EnvironmentArg
{
     this(string[string] env, bool useParent=false) { this.env =
env;
this.useParent = useParent;}
     this(bool useParent) {this.useParent = useParent;}
     string[string] env;
     bool useParent;
}

spawnProcess("helloworld", EnvironmentArg(["x":"y"], true)); // use parent
environment, add x=y
spawnProcess("helloworld", EnvironmentArg(["x":"y"])); // replace environment
with x=y
spawnProcess("helloworld", EnvironmentArg(false)); // use empty environment
spawnProcess("helloworld", EnvironmentArg(true)); // use parent environment
exactly

EnvironmentArg should probably have better name, and I would
recommend some global functions that make common things, like:

EnvironmentArg emptyEnvironment() { return EnvironmentArg(null, false);}
EnvironmentArg parentEnvironment() { return EnvironmentArg(null, true);}

Like? Hate?
Hmm.. what if spawnProcess() takes a normal string[string] like it does now, but we add a flag to Config that determines whether it is merged with the parent's environment or not?

    string[string] myEnv = [ "foo" : "bar" ];
    spawnProcess("helloworld", null);                        // Parent's env
    spawnProcess("helloworld", myEnv);                       // Parent's env + myEnv
    spawnProcess("helloworld", null, ..., Config.clearEnv);  // Empty env
    spawnProcess("helloworld", myEnv, ..., Config.clearEnv); // Only myEnv
[...]

+1. I like this idea. Makes code more self-documenting, which is a good thing.

Alternatively:

    struct useParentEnv { string[string] aa; }
    struct newEnv { string[string] aa; }

    spawnProcess(E)(string program, E env, ...)
    {
        static if (is(E == useParentEnv))
            // inherit values from current environment
        else
            // don't inherit from current environment
    }

    spawnProcess("helloworld", useParentEnv(["a": "b", ... ]), ...);
    spawnProcess("helloworld", newEnv(["a": "b", ... ]), ...);

Of course, rename useParentEnv and newEnv to something more suitable.

T

-- 
"You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
Mar 13 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 13 March 2013 at 20:44:00 UTC, Lars T. Kyllingstad
wrote:
 On Wednesday, 13 March 2013 at 20:26:44 UTC, Steven 
 Schveighoffer wrote:
 I hate to have feature creep at this point, but one kind of 
 annoying thing
 is, if you want to *add* to the current environment, it is a 
 multi-step
 process:

 auto curenv = environment.toAA;
 curenv["x"] = "y";
 spawnProcess("helloworld", curenv);

 But with something similar to Dennis' idea, we have a possible 
 way to do
 that without making a copy of the current environment into an 
 AA and
 adding:

 struct EnvironmentArg
 {
     this(string[string] env, bool useParent=false) { this.env 
 = env;
 this.useParent = useParent;}
     this(bool useParent) {this.useParent = useParent;}
     string[string] env;
     bool useParent;
 }

 spawnProcess("helloworld", EnvironmentArg(["x":"y"], true)); 
 // use parent
 environment, add x=y
 spawnProcess("helloworld", EnvironmentArg(["x":"y"])); // 
 replace
 environment with x=y
 spawnProcess("helloworld", EnvironmentArg(false)); // use 
 empty environment
 spawnProcess("helloworld", EnvironmentArg(true)); // use parent
 environment exactly

 EnvironmentArg should probably have better name, and I would 
 recommend
 some global functions that make common things, like:

 EnvironmentArg emptyEnvironment() { return 
 EnvironmentArg(null, false);}
 EnvironmentArg parentEnvironment() { return 
 EnvironmentArg(null, true);}

 Like? Hate?
Hmm.. what if spawnProcess() takes a normal string[string] like it does now, but we add a flag to Config that determines whether it is merged with the parent's environment or not?

    string[string] myEnv = [ "foo" : "bar" ];
    spawnProcess("helloworld", null);                        // Parent's env
    spawnProcess("helloworld", myEnv);                       // Parent's env + myEnv
    spawnProcess("helloworld", null, ..., Config.clearEnv);  // Empty env
    spawnProcess("helloworld", myEnv, ..., Config.clearEnv); // Only myEnv
The more I think about this, the more it seems like a good idea:

1. A string[string] parameter for the environment, which defaults to null, and for which there is no difference between null and empty -- they are both empty.

2. A Config flag that determines whether the given AA should be merged with the parent's environment or not, with the former being the default.

3. Variables in the AA always override variables from the parent's environment.

4. The two spawnProcess() overloads that do *not* take an environment parameter will be removed.

5. Instead, we add two overloads without the redirection parameters, since the Config parameter will probably be used more often now:

    spawnProcess(string prog, string[string] env, Config conf);
    spawnProcess(string[] args, string[string] env, Config conf);

Looks good?

Lars
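The merge rules in points 1 to 3 can be sketched as a small pure function. The name effectiveEnv and the boolean newEnv parameter are hypothetical stand-ins for whatever spawnProcess() and the Config flag end up doing internally; the sketch only shows the intended semantics, not the implementation.

```d
/// Hypothetical helper illustrating the proposed rules:
/// - a null and an empty childEnv behave identically,
/// - by default the parent's variables are inherited,
/// - variables in childEnv override inherited ones.
string[string] effectiveEnv(string[string] parentEnv,
                            string[string] childEnv,
                            bool newEnv = false)
{
    string[string] result;

    // Inherit from the parent unless the caller asked for a fresh environment.
    if (!newEnv)
        foreach (name, value; parentEnv)
            result[name] = value;

    // Values given by the caller always win over inherited ones.
    foreach (name, value; childEnv)
        result[name] = value;

    return result;
}
```

Under these rules, passing null with the default flag reproduces the parent's environment, and passing null with the flag set yields an empty environment, so no null-versus-empty distinction is needed.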
Mar 14 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Mar 2013 16:20:24 -0400, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 The more I think about this, the more it seems like a good idea:

 1. A string[string] parameter for the environment, which defaults
 to null, and for which there is no difference between null and
 empty -- they are both empty.

 2. A Config flag that determines whether the given AA should be
 merged with the parent's environment or not, with the former
 being the default.

 3. Variables in the AA always override variables from the
 parent's environment.

 4. The two spawnProcess() overloads that do *not* take an
 environment parameter will be removed.

 5. Instead, we add two overloads without the redirection
 parameters, since the Config parameter will probably be used more
 often now:

     spawnProcess(string prog, string[string] env, Config conf);
     spawnProcess(string[] args, string[string] env, Config conf);

 Looks good?
Looks good.

Part of me thinks you shouldn't have to specify environment in order to specify redirects, but I don't know how that works with the overloads. I know File is a struct, so it shouldn't bind to null, right?

By "AA should be merged with the parent's environment or not, with the former being the default", I'm assuming the "set" flag will mean "don't inherit". What name do you have in mind? Since we already have dontDoTheCloseThing, I think dontDoTheEnvironmentInheritThing would be good ;)

I know it's bikeshedding, but negative flags are awful, we should come up with positive ones. ignoreParentEnv?

-Steve
Mar 14 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Thursday, 14 March 2013 at 20:34:11 UTC, Steven Schveighoffer 
wrote:
 On Thu, 14 Mar 2013 16:20:24 -0400, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 The more I think about this, the more it seems like a good 
 idea:

 [...]

 Looks good?
Looks good. Part of me thinks you shouldn't have to specify environment in order to specify redirects, but I don't know how that works with the overloads. I know File is a struct, so it shouldn't bind to null, right?
No, there won't be any problem with adding overloads with and without environment, but then there'll be six spawnProcess() versions. We have to weigh that against the user having to explicitly specify a null when they don't want to add to the environment, but still want redirects. I don't know which is worse.

I don't think this is too bad, though:

    auto p = spawnProcess("myapp", null, File("input.txt"));
 By "AA should be merged with the parent's environment or not, 
 with the former being the default", I'm assuming the "set" flag 
 will mean "don't inherit".  What name do you have in mind?  
 Since we already have dontDoTheCloseThing, I think 
 dontDoTheEnvironmentInheritThing would be good ;)

 I know it's bikeshedding, but negative flags are awful, we 
 should come up with positive ones.  ignoreParentEnv?
Now that the big pieces are seemingly falling into place, it is probably time for bikeshedding. I was thinking clearEnv or newEnv, but ignoreParentEnv is perhaps more explicit. Speaking of negative flags, do you have better suggestions for the Config.noCloseStd... ones? dontDoTheCloseThing became inheritFDs, btw. :) Also open for suggestions on that one. Lars
Mar 14 2013
next sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Thursday, 14 March 2013 at 21:51:37 UTC, Lars T. Kyllingstad 
wrote:
 On Thursday, 14 March 2013 at 20:34:11 UTC, Steven 
 Schveighoffer wrote:
 Part of me thinks you shouldn't have to specify environment in 
 order to specify redirects, but I don't know how that works 
 with the overloads.  I know File is a struct, so it shouldn't 
 bind to null, right?
No, there won't be any problem with adding overloads with and without environment, but then there'll be six spawnProcess() versions. We have to weigh that against the user having to explicitly specify a null when they don't want to add to the environment, but still want redirects. [...]
We could switch them around, though, and put the environment after the redirects:

    spawnProcess(args, stdin, stdout, stderr, env, config)
    spawnProcess(args, env, config)
    spawnProcess(prog, stdin, stdout, stderr, env, config)
    spawnProcess(prog, env, config)

I didn't specify any default values there, but every parameter would have a default except the first one.

Lars
Mar 14 2013
prev sibling next sibling parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Thu, 14 Mar 2013 22:51:36 +0100
schrieb "Lars T. Kyllingstad" <public kyllingen.net>:

 Now that the big pieces are seemingly falling into place, it is 
 probably time for bikeshedding.  I was thinking clearEnv or 
 newEnv, but ignoreParentEnv is perhaps more explicit.
I think clearEnv is pretty clear already. Someone should put up an online poll for that.
 dontDoTheCloseThing became inheritFDs, btw. :)  Also open for 
 suggestions on that one.
Looks like a mix of Windows and Unix terminology. :) closeHandlesOnExec anyone? Am I right assuming that on Windows it will just not set the bInheritHandles flag? -- Marco
Mar 14 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Friday, 15 March 2013 at 00:36:59 UTC, Marco Leise wrote:
 Am Thu, 14 Mar 2013 22:51:36 +0100
 schrieb "Lars T. Kyllingstad" <public kyllingen.net>:

 Now that the big pieces are seemingly falling into place, it 
 is probably time for bikeshedding.  I was thinking clearEnv or 
 newEnv, but ignoreParentEnv is perhaps more explicit.
I think clearEnv is pretty clear already. Someone should put up an online poll for that.
Bikeshedding, yes, but I don't think we've quite reached the point where we need polls yet. :)
 dontDoTheCloseThing became inheritFDs, btw. :)  Also open for 
 suggestions on that one.
Looks like a mix of Windows and Unix terminology. :) closeHandlesOnExec anyone? Am I right assuming that on Windows it will just not set the bInheritHandles flag?
You are not entirely right. The flag is used only by POSIX code, and causes spawnProcess() to *not* close all open file descriptors. On Windows, inheritFDs has no effect whatsoever. closeHandlesOnExec looks even more like a mix to me. I can't recall ever seeing the word "handle" used in *NIX documentation. "File descriptor" is used pretty consistently. I also tried to avoid using the words "close on exec" in the name, because I didn't want users to think this has anything to do with the FD_CLOEXEC flag. Lars
Mar 14 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Mar 2013 17:51:36 -0400, Lars T. Kyllingstad  
<public kyllingen.net> wrote:

 On Thursday, 14 March 2013 at 20:34:11 UTC, Steven Schveighoffer wrote:
 On Thu, 14 Mar 2013 16:20:24 -0400, Lars T. Kyllingstad  
 <public kyllingen.net> wrote:

 The more I think about this, the more it seems like a good idea:

 [...]

 Looks good?
Looks good. Part of me thinks you shouldn't have to specify environment in order to specify redirects, but I don't know how that works with the overloads. I know File is a struct, so it shouldn't bind to null, right?
No, there won't be any problem with adding overloads with and without environment, but then there'll be six spawnProcess() versions. We have to weigh that against the user having to explicitly specify a null when they don't want to add to the environment, but still want redirects. I don't know which is worse. I don't think this is too bad, though: auto p = spawnProcess("myapp", null, File("input.txt"));
OK, that is fine with me.
 By "AA should be merged with the parent's environment or not, with the  
 former being the default", I'm assuming the "set" flag will mean "don't  
 inherit".  What name do you have in mind?  Since we already have  
 dontDoTheCloseThing, I think dontDoTheEnvironmentInheritThing would be  
 good ;)

 I know it's bikeshedding, but negative flags are awful, we should come  
 up with positive ones.  ignoreParentEnv?
Now that the big pieces are seemingly falling into place, it is probably time for bikeshedding. I was thinking clearEnv or newEnv, but ignoreParentEnv is perhaps more explicit.
clearEnv is somewhat incorrect: since you could specify a new environment via the env parameter, it won't be clear. newEnv sounds OK, and it is probably better than ignoreParentEnv because it is less verbose, even if it is less descriptive. At the very least, it will make people look it up ;)
 Speaking of negative flags, do you have better suggestions for the  
 Config.noCloseStd... ones?
retainStdout
 dontDoTheCloseThing became inheritFDs, btw. :)  Also open for  
 suggestions on that one.
That is fine with me. -Steve
Mar 15 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Friday, 15 March 2013 at 13:57:10 UTC, Steven Schveighoffer 
wrote:
 On Thu, 14 Mar 2013 17:51:36 -0400, Lars T. Kyllingstad 
 <public kyllingen.net> wrote:

 Speaking of negative flags, do you have better suggestions for 
 the Config.noCloseStd... ones?
retainStdout
Nice!
Mar 15 2013
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 13 Mar 2013 02:45:57 -0400, dennis luehring <dl.soluz gmx.net>
wrote:

 why not differentiate on callsite?

 like

 environment_usage =
 {
    PARENT_ENVIRONMENT,
    NONE_ENVIRONMENT, // which differs from empty given environment
    GIVEN_ENVIRONMENT
 }

 spawnProcess(process,parameter,environment_usage = PARENT_ENVIRONMENT,  
 environment = null)

 it feels very wrong to put the environment "usage" type in any way into  
 the environment-abstraction itself (by occupying null or empty...)

 +some nice helpers

 spawnProcessWithParentEnvironment(process,parameter)
 spawnProcessWithoutEnvironment(process,parameter)
 spawnProcessWithEnvironment(process,parameter,environment=...)

 wouldn't that be much clearer?

 the other way could be an

 spawnProcess(process,parameter,environment=use_parent_environment());
 with parent-environment selector

 spawnProcess(process,parameter,environment=given_environment(environment));
Hm.. I think I actually like this. I hate to have feature creep at this point, but one kind of annoying thing is, if you want to *add* to the current environment, it is a multi-step process:

    auto curenv = environment.toAA;
    curenv["x"] = "y";
    spawnProcess("helloworld", curenv);

But with something similar to Dennis' idea, we have a possible way to do that without making a copy of the current environment into an AA and adding:

    struct EnvironmentArg
    {
        this(string[string] env, bool useParent=false) { this.env = env; this.useParent = useParent; }
        this(bool useParent) { this.useParent = useParent; }
        string[string] env;
        bool useParent;
    }

    spawnProcess("helloworld", EnvironmentArg(["x":"y"], true)); // use parent environment, add x=y
    spawnProcess("helloworld", EnvironmentArg(["x":"y"])); // replace environment with x=y
    spawnProcess("helloworld", EnvironmentArg(false)); // use empty environment
    spawnProcess("helloworld", EnvironmentArg(true)); // use parent environment exactly

EnvironmentArg should probably have a better name, and I would recommend some global functions that make common things, like:

    EnvironmentArg emptyEnvironment() { return EnvironmentArg(null, false); }
    EnvironmentArg parentEnvironment() { return EnvironmentArg(null, true); }

Like? Hate?

-Steve
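To make the four cases concrete, here is a minimal sketch of how an argument of this kind could be resolved into the environment a child process would see. It is only an illustration: the simplified EnvironmentArg (without constructors) and the resolve() helper are hypothetical names, not part of the proposed module, and the parent's environment is passed in explicitly so the logic can be checked without spawning anything.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the EnvironmentArg struct sketched above.
struct EnvironmentArg
{
    string[string] env; // variables to add or override
    bool useParent;     // true: start from a copy of the parent's environment
}

// Hypothetical helper: computes the environment a child would receive.
// The parent's environment is a parameter to keep the function testable.
string[string] resolve(EnvironmentArg arg, string[string] parent)
{
    string[string] result;
    if (arg.useParent)
    {
        foreach (name, value; parent)
            result[name] = value;
    }
    foreach (name, value; arg.env)
        result[name] = value; // explicitly given values win
    return result;
}

void main()
{
    auto parent = ["PATH": "/usr/bin", "x": "old"];

    writeln(resolve(EnvironmentArg(["x": "y"], true), parent));  // parent plus x=y
    writeln(resolve(EnvironmentArg(["x": "y"], false), parent)); // only x=y
    writeln(resolve(EnvironmentArg(null, false), parent));       // empty
}
```

Keeping the merge in one small function like this would also make the "merge by default" behaviour easy to unit test.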
Mar 13 2013
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 13 March 2013 at 06:31:57 UTC, Lars T. Kyllingstad 
wrote:
 That's a lot better than ["____":"_____"], at least. :)

 But still, the difference between a null AA and an empty AA is 
 still very subtle, and I am hesitant to design an API that 
 depends on it.  We'd have to explain to the users that "ok, so 
 there are two kinds of empty AAs: the ones you've done nothing 
 with, and the ones you've added and removed a value from..."

 Furthermore, the language spec doesn't seem to mention "null" 
 in relation to AAs.  Shouldn't the difference between null and 
 empty then be treated as an implementation detail?  Can we even 
 be sure that "aa is null" will work in two years?
It doesn't need to be part of the API. Just treat emptyEnvironment as a magic value that means to pass an empty environment. If the language's AA semantics change, the implementation can return an arbitrary AA literal from emptyEnvironment, and spawnProcess would do a "if (environment is emptyEnvironment)" check. Making use of the current distinction between empty and null AAs is useful as it simplifies the implementation, and I think is marginally better than a magic (non-empty) value.
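For readers who have not run into the distinction being discussed, the following small sketch shows it. It relies on the behaviour of the AA implementation as I understand it, which, as noted above, the language spec does not promise.

```d
import std.stdio : writeln;

void main()
{
    // An AA that has never been touched compares identical to null.
    string[string] untouched;
    assert(untouched is null);
    assert(untouched.length == 0);

    // An AA that has had an element added and removed is empty, but not null.
    string[string] emptied;
    emptied["x"] = "y";
    emptied.remove("x");
    assert(emptied !is null);
    assert(emptied.length == 0);

    writeln("null AA and empty AA are distinguishable");
}
```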
Mar 13 2013
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 21:39:47 UTC, Steven Schveighoffer 
wrote:
 I think all are in agreement at this point that closing the 
 files between fork and exec is a good solution.  Whether or not 
 to ALSO set F_CLOEXEC wherever Phobos opens a file descriptor 
 is an additional matter.
I have created a pull request for a sys/resource.h translation now. It contains getrlimit(), which we can use to get the max. number of file descriptors.

https://github.com/D-Programming-Language/druntime/pull/445

Lars
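As a rough sketch of the query in question, assuming the translation is available as core.sys.posix.sys.resource (the module name in druntime as I know it): the maxOpenFiles() helper is a hypothetical name used only for this example, and the code does nothing on non-POSIX systems.

```d
version (Posix)
{
    import core.sys.posix.sys.resource : getrlimit, rlimit, RLIMIT_NOFILE;
    import std.exception : errnoEnforce;
    import std.stdio : writeln;

    // Hypothetical helper: soft limit on the number of open file descriptors.
    auto maxOpenFiles()
    {
        rlimit lim;
        errnoEnforce(getrlimit(RLIMIT_NOFILE, &lim) == 0, "getrlimit failed");
        return lim.rlim_cur;
    }

    void main()
    {
        // Descriptors 3 .. maxOpenFiles() are the candidates for closing
        // between fork and exec.
        writeln("max open files (soft limit): ", maxOpenFiles());
    }
}
else
{
    void main() {}
}
```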
Mar 13 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 5 March 2013 at 21:04:15 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 5 March 2013 at 20:19:06 UTC, Lars T. Kyllingstad 
 wrote:
 A special thanks to Vladimir P. for pointing out an egregious 
 flaw in the original design.
But wait, there's more!
Aw, man....
 (please don't hurt me)

 1. Typo: "plattform"
That would be my native language shining through. :)
 2. Is there any meaning in the idea of consolidating 
 spawnProcess/pipeProcess/execute and 
 spawnShell/pipeShell/shell? How about that collectOutput idea?
In principle, I like that idea. In fact, you'll see that execute() and shell() are now both implemented using a function similar to (and inspired by) the collectOutput() method you suggested. Furthermore, pipeProcess() and pipeShell() both forward to a pipeProcessImpl() function which takes a spawn function as a template parameter.

I'm not sure if this is the API we want to expose to the user, though. Firstly,

    auto r = execute("foo");

is a lot easier on the eye than

    auto r = pipeProcess("foo", Redirect.stdout | Redirect.stderrToStdout)
        .collectOutput();

Secondly, I only think a collectOutput() method would be appropriate to use if one of the output streams is redirected into the other. Consider this:

    auto r = pipeProcess("foo").collectOutput();

Now, the output and error streams are redirected into separate pipes. But what if "foo" starts off by writing 1 MB of data to its error stream? Maybe it could be solved by some intelligent behaviour on the part of collectOutput(), based on the redirect flags, but I think it is better to encapsulate pipe creation AND reading in one function, as is currently done with execute() and shell().

pipeProcess(), on the other hand, that is another matter. I wonder if pipeProcessImpl() would be a good public interface (with a different name, of course)?
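The single-pipe approach described above can be sketched as follows. This is an illustration only: it uses pipeShell, Redirect and wait as they exist in present-day Phobos (which may differ from the version under review), and the Result struct and runMerged() helper are hypothetical names.

```d
import std.process : pipeShell, Redirect, wait;
import std.stdio : write;

// Hypothetical result type for this sketch.
struct Result
{
    int status;
    string output;
}

// Hypothetical helper: run a shell command with stderr merged into stdout,
// so there is only one pipe to drain and the child cannot block on a second
// pipe that nobody is reading.
Result runMerged(string command)
{
    auto pipes = pipeShell(command, Redirect.stdout | Redirect.stderrToStdout);

    string output;
    foreach (line; pipes.stdout.byLine)
    {
        output ~= line.idup;
        output ~= "\n";
    }
    // Only wait once the pipe has been read to the end.
    return Result(wait(pipes.pid), output);
}

void main()
{
    auto r = runMerged("echo to-stdout && echo to-stderr 1>&2");
    write(r.output);
}
```

Reading everything before calling wait() is the important part; waiting first could deadlock if the child fills the pipe buffer.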
 3. Where are we with compatibility with the old module? One 
 idea I haven't seen mentioned yet is: perhaps we could make the 
 return value of "shell" have a deprecated "alias this" to the 
 output string, so that it's implicitly convertible to a string 
 to preserve compatibility.
If that works in all cases, I think it is a fantastic idea! There is still the issue of the old shell() throwing when the process exits with a nonzero status, though. Maybe I'll just have to bite the bullet and accept a different name. :( It really seems to be the one thing that is preventing the two modules from being combined. Suggestions, anyone?
 4. Is there any way to deal with pipe clogging (pipe buffer 
 getting exceeded when manually handling both input and output 
 of a subprocess)? Can we query the number of bytes we can 
 immediately read/write without blocking on a File?
I've wondered about that myself. I don't know whether this is a problem std.process should aim to solve in any way, or if it should be treated as a general problem with File. It is a real problem, though. Pipe buffers are surprisingly small.
 5. How about that Environment.opIn_r?
Forgot about it. :) I'll add it. Lars
Mar 05 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-03-06 08:27, Lars T. Kyllingstad wrote:

 That would be my native language shining through. :)
Doesn't everyone have a spell checker in their editors :) -- /Jacob Carlborg
Mar 06 2013
prev sibling next sibling parent "yaz" <yazan.dabain gmail.com> writes:
How about std.os.process or std.system.process as names?
Mar 06 2013
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 6 March 2013 at 07:27:19 UTC, Lars T. Kyllingstad 
wrote:
 In principle, I like that idea.  In fact, you'll see that 
 execute() and shell() are now both implemented using a function 
 similar to (and inspired by) the collectOutput() method you 
 suggested.  Furthermore, pipeProcess() and pipeShell() both 
 forward to a pipeProcessImpl() function which takes a spawn 
 function as a template parameter.
OK, this sounds reasonable. It's just that it's easy to get a little overwhelmed by the number of various functions at first, and we've seen some confusion regarding them already. Could we add a 2-by-3 table at the top of the module, to visualize how the various function flavors relate to each other?
 Now, the output and error streams are redirected into separate 
 pipes.  But what if "foo" starts off by writing 1 MB of data to 
 its error stream?
What's the problem here? If the goal is to collect both stdout and stderr, and the problem is pipe clogging, we should try to solve that. In fact, if we do come up with a correct collectOutput implementation, it would likely be useful to make the function public. It would be especially useful if the function could also correctly feed the subprocess input from a buffer (string), which could be passed as an optional parameter to collectOutput.
 Maybe I'll just have to bite the bullet and accept a different 
 name. :(  It really seems to be the one thing that is 
 preventing the two modules from being combined.  Suggestions, 
 anyone?
runShell? executeShell?
Mar 06 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Wednesday, 6 March 2013 at 07:27:19 UTC, Lars T. Kyllingstad 
wrote:
 On Tuesday, 5 March 2013 at 21:04:15 UTC, Vladimir Panteleev 
 wrote:
 5. How about that Environment.opIn_r?
Forgot about it. :) I'll add it.
So I sat down to write this function, but then I reconsidered. The thing is, checking whether the variable exists is exactly the same operation as retrieving it. In other words, this:

    if (key in environment) {
        auto val = environment[key];
        ...
    }

is equivalent to:

    if (environment.get(key) !is null) {
        auto val = environment.get(key);
        ...
    }

That just seems... wrong, somehow. Instead, I think we should encourage code like this:

    auto val = environment.get(key);
    if (val !is null) {
        ...
    }

or, even better, IMO:

    if (auto val = environment.get(key)) {
        ...
    }

But feel free to convince me otherwise, you've had great success with that so far. :)

Lars
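A small runnable sketch of the single-lookup pattern, assuming the environment API as it exists in present-day Phobos (opIndexAssign, get with an optional default, and remove). The variable name GR_EXAMPLE_VAR is made up for the example.

```d
import std.process : environment;
import std.stdio : writeln;

void main()
{
    enum name = "GR_EXAMPLE_VAR"; // hypothetical variable name

    environment[name] = "hello";

    // One lookup serves as both the existence check and the retrieval.
    auto val = environment.get(name);
    if (val !is null)
        writeln(name, " = ", val);

    environment.remove(name);

    // A missing variable yields null, or the supplied default.
    assert(environment.get(name) is null);
    assert(environment.get(name, "fallback") == "fallback");
}
```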
Mar 12 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 07:41:03 UTC, Lars T. Kyllingstad 
wrote:
 On Wednesday, 6 March 2013 at 07:27:19 UTC, Lars T. Kyllingstad 
 wrote:
 On Tuesday, 5 March 2013 at 21:04:15 UTC, Vladimir Panteleev 
 wrote:
 5. How about that Environment.opIn_r?
Forgot about it. :) I'll add it.
So I sat down to write this function, but then I reconsidered. The thing is, checking whether the variable exists is exactly the same operation as retrieving it. In other words, this:

    if (key in environment) {
        auto val = environment[key];
        ...
    }

is equivalent to:

    if (environment.get(key) !is null) {
        auto val = environment.get(key);
        ...
    }
Yes, it's just syntax sugar, and an operation supported by AAs (which environment imitates). It's useful if you don't want to retrieve the value of a variable right after checking if it exists - you just want to see if it's there or not.
Mar 12 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 08:28:02 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 12 March 2013 at 07:41:03 UTC, Lars T. Kyllingstad 
 wrote:
 On Wednesday, 6 March 2013 at 07:27:19 UTC, Lars T. 
 Kyllingstad wrote:
 On Tuesday, 5 March 2013 at 21:04:15 UTC, Vladimir Panteleev 
 wrote:
 5. How about that Environment.opIn_r?
Forgot about it. :) I'll add it.
So I sat down to write this function, but then I reconsidered. The thing is, checking whether the variable exists is exactly the same operation as retrieving it. In other words, this:

    if (key in environment) {
        auto val = environment[key];
        ...
    }

is equivalent to:

    if (environment.get(key) !is null) {
        auto val = environment.get(key);
        ...
    }
Yes, it's just syntax sugar, and an operation supported by AAs (which environment imitates). It's useful if you don't want to retrieve the value of a variable right after checking if it exists - you just want to see if it's there or not.
For AAs, 'in' returns a pointer to the element, which is null if the element does not exist. I can't think of a good way to implement this. Since we have to convert the raw environment variable to a D string anyways, we'd have to do something like:

    string* opIn_r(string var)
    {
        auto val = get(var);
        if (val is null) return null;
        else return [val].ptr;
    }

but that seems rather pointless to me.

Lars
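For context, this is what the pointer returned by 'in' buys you with a built-in AA: the element can be updated in place after a single lookup. The bump() helper is a hypothetical name used only for this sketch.

```d
import std.stdio : writeln;

// Hypothetical helper: increments an existing counter with one lookup.
// Returns false if the key is absent.
bool bump(ref int[string] counts, string key)
{
    if (auto p = key in counts) // p is an int* into the AA, or null
    {
        *p += 1;
        return true;
    }
    return false;
}

void main()
{
    int[string] counts = ["a": 1];
    writeln(bump(counts, "a"), " ", counts["a"]);
    writeln(bump(counts, "b"));
}
```

With environment variables there is no stored D string to point at, which is why a pointer-returning 'in' gains nothing there.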
Mar 12 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 08:58:13 UTC, Lars T. Kyllingstad 
wrote:
 For AAs, 'in' returns a pointer to the element, which is null 
 if the element does not exist.  I can't think of a good way to 
 implement this.  Since we have to convert the raw environment 
 variable to a D string anyways, we'd have to do something like:

   string* opIn_r(string var)
   {
       auto val = get(var);
       if (val is null) return null;
       else return [val].ptr;
   }

 but that seems rather pointless to me.
Yes. That use of the "is" operator is mainly to allow updating the value without looking up the key twice. This behavior could be implemented using a proxy object, but this is not what I was talking about. I meant the specific case of "if (key in environment)".
Mar 12 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 05:13:59 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 12 March 2013 at 08:58:13 UTC, Lars T. Kyllingstad wrote:
 For AAs, 'in' returns a pointer to the element, which is null if the  
 element does not exist.  I can't think of a good way to implement  
 this.  Since we have to convert the raw environment variable to a D  
 string anyways, we'd have to do something like:

   string* opIn_r(string var)
   {
       auto val = get(var);
       if (val is null) return null;
       else return [val].ptr;
   }

 but that seems rather pointless to me.
Yes. That use of the "is" operator is mainly to allow updating the value
you meant "in", not "is", right?
 without looking up the key twice. This behavior could be implemented  
 using a proxy object, but this is not what I was talking about. I meant  
 the specific case of "if (key in environment)".
I think Vladimir wants to have opIn_r return bool? -Steve
Mar 12 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12-Mar-2013 18:09, Steven Schveighoffer wrote:
 On Tue, 12 Mar 2013 05:13:59 -0400, Vladimir Panteleev
 <vladimir thecybershadow.net> wrote:

 On Tuesday, 12 March 2013 at 08:58:13 UTC, Lars T. Kyllingstad wrote:
 For AAs, 'in' returns a pointer to the element, which is null if the
 element does not exist.  I can't think of a good way to implement
 this.  Since we have to convert the raw environment variable to a D
 string anyways, we'd have to do something like:

   string* opIn_r(string var)
   {
       auto val = get(var);
       if (val is null) return null;
       else return [val].ptr;
   }

 but that seems rather pointless to me.
Yes. That use of the "is" operator is mainly to allow updating the value
you meant "in", not "is", right?
 without looking up the key twice. This behavior could be implemented
 using a proxy object, but this is not what I was talking about. I
 meant the specific case of "if (key in environment)".
I think Vladimir wants to have opIn_r return bool?
What's wrong with adding an actual AA (inside) as a write-through cache for environment variables? -- Dmitry Olshansky
Mar 12 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 12 March 2013 at 14:17:37 UTC, Dmitry Olshansky wrote:
 What's wrong with adding an actual AA (inside) as a 
 write-through cache for environment variables?
That the environment variable may change after we've cached it. Lars
Mar 12 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer 
wrote:
 Yes. That use of the "is" operator is mainly to allow updating 
 the value
you meant "in", not "is", right?
Yes. Sorry, the keys are right next to each other :)
 without looking up the key twice. This behavior could be 
 implemented using a proxy object, but this is not what I was 
 talking about. I meant the specific case of "if (key in 
 environment)".
I think Vladimir wants to have opIn_r return bool?
Returning the string (doing the same as ".get(key, null)") should have the same effect in an if statement.
Mar 12 2013
next sibling parent "simendsjo" <simendsjo gmail.com> writes:
On Tuesday, 12 March 2013 at 14:47:53 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer 
 wrote:
(...)
 you meant "in", not "is", right?
Yes. Sorry, the keys are right next to each other :)
Go dvorak!
Mar 12 2013
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 12 Mar 2013 10:47:52 -0400, Vladimir Panteleev  
<vladimir thecybershadow.net> wrote:

 On Tuesday, 12 March 2013 at 14:09:55 UTC, Steven Schveighoffer wrote:
 Yes. That use of the "is" operator is mainly to allow updating the  
 value
you meant "in", not "is", right?
Yes. Sorry, the keys are right next to each other :)
 without looking up the key twice. This behavior could be implemented  
 using a proxy object, but this is not what I was talking about. I  
 meant the specific case of "if (key in environment)".
I think Vladimir wants to have opIn_r return bool?
Returning the string (doing the same as ".get(key, null)") should have the same effect in an if statement.
Yes that is true. Why doesn't .get work for your case again?

    environment.get(key)
    vs.
    key in environment

Doesn't seem that different to me... I suppose an opIn_r alias is not difficult to add, if it's just for syntax sugar.

-Steve
Mar 12 2013
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 12 March 2013 at 14:58:21 UTC, Steven Schveighoffer 
wrote:
 Yes that is true.  Why doesn't .get work for your case again?

 environment.get(key)
 vs.
 key in environment

 Doesn't seem that different to me...

 I suppose an opIn_r alias is not difficult to add, if it's just 
 for syntax sugar.
Yes, exactly. :) (We've seriously overblown this...)
Mar 12 2013
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
Sorry for the delay, but I've pushed a new version now.  There 
are still a few things I haven't done wrt. documentation* and 
unittests**, but the changes to the API and internals should be 
in place.

The biggest changes are that spawnProcess() now closes all 
non-std file descriptors on POSIX systems, and that it handles 
environment variables differently.  Specifically, they are now 
*merged* with the parent's environment by default, rather than 
replacing it.  In addition, there are some things that have been 
renamed, bugs that have been fixed, etc.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
* I'm going to add a function table to the module introduction and flesh out the docs for some of the functions a bit.
** spawnShell() currently has no unittest, and I haven't enabled the "burn-in" test for Vladimir's escape*() functions yet.
Mar 20 2013
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Wednesday, 20 March 2013 at 22:46:58 UTC, Lars T. Kyllingstad 
wrote:
 Sorry for the delay, but I've pushed a new version now.  There 
 are still a few things I haven't done wrt. documentation* and 
 unittests**, but the changes to the API and internals should be 
 in place.
Since (IIRC) all issues regarding incompatibility with std.process have been resolved, how about renaming the module to std.process? This way it'll also be easier to test backwards-compatibility in existing programs.
Mar 21 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Mar 21, 2013 at 12:08:00PM +0100, Vladimir Panteleev wrote:
 On Wednesday, 20 March 2013 at 22:46:58 UTC, Lars T. Kyllingstad
 wrote:
Sorry for the delay, but I've pushed a new version now.  There are
still a few things I haven't done wrt. documentation* and
unittests**, but the changes to the API and internals should be in
place.
Since (IIRC) all issues regarding incompatibility with std.process have been resolved, how about renaming the module to std.process? This way it'll also be easier to test backwards-compatibility in existing programs.
+1. I hate std.process2 with a passion. Let's keep it as std.process. T -- The computer is only a tool. Unfortunately, so is the user. -- Armaphine, K5
Mar 21 2013
next sibling parent 1100110 <0b1100110 gmail.com> writes:
On 03/21/2013 11:35 AM, H. S. Teoh wrote:
 On Thu, Mar 21, 2013 at 12:08:00PM +0100, Vladimir Panteleev wrote:
 On Wednesday, 20 March 2013 at 22:46:58 UTC, Lars T. Kyllingstad
 wrote:
 Sorry for the delay, but I've pushed a new version now.  There are
 still a few things I haven't done wrt. documentation* and
 unittests**, but the changes to the API and internals should be in
 place.
Since (IIRC) all issues regarding incompatibility with std.process have been resolved, how about renaming the module to std.process? This way it'll also be easier to test backwards-compatibility in existing programs.
+1. I hate std.process2 with a passion. Let's keep it as std.process. T
++1 No more incrementing names!
Mar 21 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Thursday, 21 March 2013 at 16:37:38 UTC, H. S. Teoh wrote:
 On Thu, Mar 21, 2013 at 12:08:00PM +0100, Vladimir Panteleev 
 wrote:
 
 Since (IIRC) all issues regarding incompatibility with 
 std.process
 have been resolved, how about renaming the module to 
 std.process?
 This way it'll also be easier to test backwards-compatibility 
 in
 existing programs.
+1. I hate std.process2 with a passion. Let's keep it as std.process.
The main reason I created a separate std.process2 was in fact not that I intended to keep it that way, but because I kept getting merge conflicts whenever I merged in Phobos master. If you all don't mind, I'd like to keep it separate until we are satisfied that the API is stable. Lars
Mar 21 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Mar 21, 2013 at 06:32:59PM +0100, Lars T. Kyllingstad wrote:
 On Thursday, 21 March 2013 at 16:37:38 UTC, H. S. Teoh wrote:
On Thu, Mar 21, 2013 at 12:08:00PM +0100, Vladimir Panteleev
wrote:
Since (IIRC) all issues regarding incompatibility with std.process
have been resolved, how about renaming the module to std.process?
This way it'll also be easier to test backwards-compatibility in
existing programs.
+1. I hate std.process2 with a passion. Let's keep it as std.process.
The main reason I created a separate std.process2 was in fact not that I intended to keep it that way, but because I kept getting merge conflicts whenever I merged in Phobos master. If you all don't mind, I'd like to keep it separate until we are satisfied that the API is stable.
[...] That's fine, as long as it's renamed to std.process once it's merged. T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
Mar 21 2013
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-02-23 12:31, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :) The upshot
 is that the module has actually seen active use over those years, both
 by yours truly and others, so hopefully the worst wrinkles are already
 ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
I would like a function for getting the current process path, i.e. this: http://dsource.org/projects/tango/attachment/ticket/1536/process.d -- /Jacob Carlborg
Mar 25 2013
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 25 Mar 2013 15:17:14 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2013-02-23 12:31, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :) The upshot
 is that the module has actually seen active use over those years, both
 by yours truly and others, so hopefully the worst wrinkles are already
 ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
I would like a function for getting the current process path, i.e. this: http://dsource.org/projects/tango/attachment/ticket/1536/process.d
This is orthogonal to the replacement of process creation functions, can we add this as an enhancement later? It definitely is useful, but std.process sucks right now. I don't want to delay the replacement of existing functions with adding extra features. -Steve
Mar 25 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-25 20:27, Steven Schveighoffer wrote:

 This is orthogonal to the replacement of process creation functions, can
 we add this as an enhancement later?  It definitely is useful, but
 std.process sucks right now.  I don't want to delay the replacement of
 existing functions with adding extra features.
Created a pull request for it: https://github.com/D-Programming-Language/phobos/pull/1224 -- /Jacob Carlborg
Mar 25 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 March 2013 at 21:09:57 UTC, Jacob Carlborg wrote:
 On 2013-03-25 20:27, Steven Schveighoffer wrote:

 This is orthogonal to the replacement of process creation 
 functions, can
 we add this as an enhancement later?  It definitely is useful, 
 but
 std.process sucks right now.  I don't want to delay the 
 replacement of
 existing functions with adding extra features.
Created a pull request for it: https://github.com/D-Programming-Language/phobos/pull/1224
If this gets added *before* the new std.process, we should at least agree on a consistent naming convention. The new std.process has @property thisProcessID(), which replaces getpid(), and which is somewhat consistent with std.concurrency.thisTid. I also think std.file.getcwd() is part of this function family, and that it should be moved to std.process (under a different name). Personally, I dislike function names that start with "get". Lars
Mar 26 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-26 14:07, Lars T. Kyllingstad wrote:

 If this gets added *before* the new std.process, we should at least
 agree on a consistent naming convention.  The new std.process has
 @property thisProcessID(), which replaces getpid(), and which is
 somewhat consistent with std.concurrency.thisTid.  I also think
 std.file.getcwd() is part of this function family, and that it should be
 moved to std.process (under a different name).

 Personally, I dislike function names that start with "get".
so what would you prefer, "thisProcessPath"? Or thisExecutablePath, perhaps? -- /Jacob Carlborg
Mar 26 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Tuesday, 26 March 2013 at 14:10:56 UTC, Jacob Carlborg wrote:
 On 2013-03-26 14:07, Lars T. Kyllingstad wrote:

 If this gets added *before* the new std.process, we should at 
 least
 agree on a consistent naming convention.  The new std.process 
 has
 @property thisProcessID(), which replaces getpid(), and which
 is
 somewhat consistent with std.concurrency.thisTid.  I also think
 std.file.getcwd() is part of this function family, and that it 
 should be
 moved to std.process (under a different name).

 Personally, I dislike function names that start with "get".
so what would you prefer, "thisProcessPath"? Or thisExecutablePath, perhaps?
Something like that, with @property. thisProcessExecutable is also an option, but it's rather long. I think I prefer thisExecutablePath. Lars
Mar 26 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-03-26 15:42, Lars T. Kyllingstad wrote:

 Something like that, with @property.  thisProcessExecutable is also an
 option, but it's rather long.  I think I prefer thisExecutablePath.
I like thisExecutablePath better than thisProcessExecutable. -- /Jacob Carlborg
Mar 26 2013
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Monday, 25 March 2013 at 19:27:31 UTC, Steven Schveighoffer 
wrote:
 On Mon, 25 Mar 2013 15:17:14 -0400, Jacob Carlborg 
 <doob me.com> wrote:

 On 2013-02-23 12:31, Lars T. Kyllingstad wrote:
 It's been years in the coming, but we finally got it done. :) 
 The upshot
 is that the module has actually seen active use over those 
 years, both
 by yours truly and others, so hopefully the worst wrinkles 
 are already
 ironed out.

 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151

 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d

 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
I would like a function for getting the current process path, i.e. this: http://dsource.org/projects/tango/attachment/ticket/1536/process.d
This is orthogonal to the replacement of process creation functions, can we add this as an enhancement later? It definitely is useful, but std.process sucks right now. I don't want to delay the replacement of existing functions with adding extra features.
Speaking of not delaying any more, could we get the "official" review going, and perhaps make it rather brief? After all, the module has effectively been under review in this forum for several weeks already... Does anyone want to volunteer as review manager? Lars
Mar 26 2013
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Sat, 23 Feb 2013 12:31:19 +0100,
"Lars T. Kyllingstad" <public kyllingen.net> wrote:

 It's been years in the coming, but we finally got it done. :)  
 The upshot is that the module has actually seen active use over 
 those years, both by yours truly and others, so hopefully the 
 worst wrinkles are already ironed out.
 
 Pull request:
 https://github.com/D-Programming-Language/phobos/pull/1151
 
 Code:
 https://github.com/kyllingstad/phobos/blob/std-process2/std/process2.d
 
 Documentation:
 http://www.kyllingen.net/code/std-process2/phobos-prerelease/std_process2.html
 
 I hope we can get it reviewed in time for the next release.  (The 
 wiki page indicates that both std.benchmark and std.uni are 
 currently being reviewed, but I fail to find any "official" 
 review threads on the forum.  Is the wiki just out of date?)
 
 Lars
Reposted from github:
I think it would be nice if the high level functions would also allow using custom environment variables. So the execute and executeShell functions should have overloads which accept a string[string] with environment variables (and probably accept a Config as well).

The execute and executeShell functions would be trivial to implement though if pipeProcess / pipeShell had an overload with environment and Config parameters. So it's probably more important that pipeProcess / pipeShell get these overloads.
Mar 31 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 31 March 2013 at 13:14:52 UTC, Johannes Pfau wrote:
 Reposted from github:
 I think it would be nice if the high level functions would also allow
 using custom environment variables. So the execute and executeShell
 functions should have overloads which accept a string[string] with
 environment variables (and probably accept a Config as well).

 The execute and executeShell functions would be trivial to implement
 though if pipeProcess / pipeShell had an overload with environment
 and Config parameters. So it's probably more important that
 pipeProcess / pipeShell get these overloads.
Implementation-wise, it is a simple task to add this functionality to both pipeProcess/pipeShell and execute/executeShell. It comes at the cost of more complex function signatures for high-level functions which have, so far, intentionally been kept rather simple:

     ProcessPipes pipeProcess(
         string[] args,
         Redirect redirectFlags = Redirect.all,
         string[string] env = null,
         Config config = Config.none);

     Tuple execute(
         string[] args,
         string[string] env = null,
         Config config = Config.none);

But maybe that's not so bad?

Lars
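To make the trade-off concrete, here is a minimal sketch of what a call with a custom environment might look like. It assumes the parameter order proposed above, which is also the form that later shipped in Phobos's std.process, where the result exposes `status` and `output`. The helper name `greetingFromChild` and the variable `GREETING` are made up for illustration.

```d
import std.process : execute, executeShell;
import std.string : strip;

// Hypothetical helper: starts a child that echoes one variable taken from
// the custom environment, and returns what the child printed.
string greetingFromChild()
{
    // With the default Config.none these entries are added on top of the
    // environment inherited from the parent; nothing is discarded.
    string[string] env = ["GREETING": "hello"];

    // version blocks do not open a new scope, so `result` stays visible.
    version (Windows)
        auto result = executeShell("echo %GREETING%", env);
    else
        auto result = execute(["sh", "-c", "echo $GREETING"], env);

    assert(result.status == 0, "child process failed");
    return result.output.strip();
}

unittest
{
    assert(greetingFromChild() == "hello");
}
```

Because env and config both have defaults, callers who need neither can still write execute(args), so the extra parameters only cost something in the documentation.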
Mar 31 2013
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
<rant>
While working on std.process, I have developed a deep and intense 
loathing for Windows process handling in particular, and the 
Win32 API in general.

On Windows XP, what do you think will happen when you run the 
following program?

     import core.sys.windows.windows;  // TerminateProcess, INVALID_HANDLE_VALUE
     import std.stdio;                 // writeln

     int main()
     {
         if (TerminateProcess(INVALID_HANDLE_VALUE, 123))
         {
             writeln("TerminateProcess succeeded, but it shouldn't have...");
             return 1;
         }
         else
         {
             writeln("TerminateProcess failed, as it should.");
             return 0;
         }
     }

As you may already have guessed, it does *not* print 
"TerminateProcess failed" and exit with code 0.

But does it print "TerminateProcess succeeded" and exit with code 
1? NO! It prints NOTHING and exits with code 123, because 
TerminateProcess() terminates the CURRENT process when it is 
passed INVALID_HANDLE_VALUE.  Aaaaargh!
</rant>

Sorry, just had to get that off my chest.  I just spent quite 
some time trying to figure out why the Win32 unittests were 
failing when there was no assert error or any other indication of 
what went wrong.
Mar 31 2013
next sibling parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Sun, Mar 31, 2013 at 7:21 PM, Lars T. Kyllingstad
<public kyllingen.net> wrote:

 <rant>
 While working on std.process, I have developed a deep and intense loathing
 for Windows process handling in particular, and the Win32 API in general.

 On Windows XP, what do you think will happen when you run the following
 program?

     int main()
     {
         if (TerminateProcess(INVALID_HANDLE_VALUE, 123))
         {
             writeln("TerminateProcess succeeded, but it shouldn't have...");
             return 1;
         }
         else
         {
             writeln("TerminateProcess failed, as it should.");
             return 0;
         }
     }

 As you may already have guessed, it does *not* print "TerminateProcess
 failed" and exit with code 0.

 But does it print "TerminateProcess succeeded" and exit with code 1? NO!
 It prints NOTHING and exits with code 123, because TerminateProcess()
 terminates the CURRENT process when it is passed INVALID_HANDLE_VALUE.
  Aaaaargh!
 </rant>

 Sorry, just had to get that off my chest.  I just spent quite some time
 trying to figure out why the Win32 unittests were failing when there was no
 assert error or any other indication of what went wrong.
<agreement>
I used to hate Windows for similar reasons. I hated it with a passion. I dreamed of devoting my life to the goal of putting Microsoft out of business to avenge the loss of countless hours of trying to get things to work as they say they should.

I ended my misery by buying an Apple MacBook Air and I couldn't be happier. The development environment is a bit unusual (it needs some tweaking to become a full-fledged POSIX development environment), but worth the time and money. OS X is the absolute best operating system I have ever had to work with. Its system API (Cocoa) makes so much more sense (despite being a bit hard to access via the C ABI).

I really recommend you to ditch that excuse for an operating system and forget it like a bad dream.
</agreement>

--
Bye,
Gor Gyolchanyan.
Mar 31 2013
parent reply "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 31 March 2013 at 15:49:59 UTC, Gor Gyolchanyan wrote:
 I really recommend you to ditch that excuse for an operating system and
 forget it like a bad dream.
I think a few people would be disappointed if the new std.process didn't support Windows. :) Lars
Mar 31 2013
parent Jacob Carlborg <doob me.com> writes:
On 2013-03-31 19:28, Lars T. Kyllingstad wrote:

 I think a few people would be disappointed if the new std.process didn't
 support Windows. :)
Those should follow the same advice :) -- /Jacob Carlborg
Mar 31 2013
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sunday, 31 March 2013 at 15:21:38 UTC, Lars T. Kyllingstad 
wrote:
 As you may already have guessed, it does *not* print 
 "TerminateProcess failed" and exit with code 0.

 But does it print "TerminateProcess succeeded" and exit with 
 code 1? NO! It prints NOTHING and exits with code 123, because 
 TerminateProcess() terminates the CURRENT process when it is 
 passed INVALID_HANDLE_VALUE.  Aaaaargh!
How did INVALID_HANDLE_VALUE get so far in the code as to reach TerminateProcess? Shouldn't an enforce call have been in place to validate whatever the source of the handle is?

Anyway, this is documented behavior. You can pass GetCurrentProcess() to TerminateProcess to terminate the current process. Your plight was caused by the unfortunate (or perhaps, unforesighted) coincidence that GetCurrentProcess() returns the special (magic) value of (HANDLE)-1, the same value as INVALID_HANDLE_VALUE.
Mar 31 2013
parent "Lars T. Kyllingstad" <public kyllingen.net> writes:
On Sunday, 31 March 2013 at 20:08:05 UTC, Vladimir Panteleev 
wrote:
 On Sunday, 31 March 2013 at 15:21:38 UTC, Lars T. Kyllingstad 
 wrote:
 As you may already have guessed, it does *not* print 
 "TerminateProcess failed" and exit with code 0.

 But does it print "TerminateProcess succeeded" and exit with 
 code 1? NO! It prints NOTHING and exits with code 123, because 
 TerminateProcess() terminates the CURRENT process when it is 
 passed INVALID_HANDLE_VALUE.  Aaaaargh!
 How did INVALID_HANDLE_VALUE get so far in the code as to reach
 TerminateProcess? Shouldn't an enforce call have been in place to
 validate whatever the source of the handle is?
I was operating under the (reasonable, IMO) assumption that TerminateProcess would fail if given an invalid handle, and that such a check would therefore be redundant. This certainly seems to be the case on newer Windows versions. As it turns out, INVALID_HANDLE_VALUE is *not* an invalid handle value, and consequently the function does not fail.
 Anyway, this is documented behavior. You can pass 
 GetCurrentProcess() to TerminateProcess to terminate the 
 current process.
This is not specified in the TerminateProcess() documentation, which simply says that hProcess must be "a handle to the process to be terminated". (It is mentioned in a user comment, though.)

http://msdn.microsoft.com/en-us/library/windows/desktop/ms686714%28v=vs.85%29.aspx

You're right that it *is* specified in the GetCurrentProcess() documentation, but there is no explicit warning that this particular magic value coincides with another magic value.
 Your plight was caused by the unfortunate (or perhaps,
 unforesighted) coincidence that GetCurrentProcess() returns the
 special (magic) value of (HANDLE)-1, the same value as
 INVALID_HANDLE_VALUE.
...which is just incredibly poor design. Lars
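A guard of the kind suggested earlier in the thread might be sketched as follows. The names `isRealProcessHandle` and `terminateChecked` are hypothetical and are not taken from the module under review. The only Win32 behaviour assumed is what the thread already states: INVALID_HANDLE_VALUE and the pseudo-handle returned by GetCurrentProcess() are both (HANDLE)-1.

```d
import std.exception : enforce;

// True only for values that could name some *other* process. Rejects null
// and the all-bits-set value, which is both INVALID_HANDLE_VALUE and the
// pseudo-handle that GetCurrentProcess() returns.
bool isRealProcessHandle(const(void)* handle)
{
    const(void)* minusOne = cast(const(void)*) size_t.max; // (HANDLE)-1
    return handle !is null && handle !is minusOne;
}

version (Windows)
{
    import core.sys.windows.windows; // HANDLE, TerminateProcess

    // Hypothetical wrapper: throws instead of killing the calling process.
    void terminateChecked(HANDLE process, uint exitCode)
    {
        enforce(isRealProcessHandle(process),
                "refusing to terminate: null or (HANDLE)-1");
        enforce(TerminateProcess(process, exitCode) != 0,
                "TerminateProcess failed");
    }
}

unittest
{
    int dummy; // any ordinary address stands in for a real handle here
    assert(isRealProcessHandle(&dummy));
    assert(!isRealProcessHandle(null));
    assert(!isRealProcessHandle(cast(void*) size_t.max));
}
```

The check is kept separate from the Win32 call so that it can be exercised on any platform, which is where a silent self-termination in a unittest run would otherwise be hardest to diagnose.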
Mar 31 2013