digitalmars.D.learn - getopt: How does arraySep work?

reply Andre Pany <andre s-e-a-p.de> writes:
Hi,

by reading the documentation of std.getopt I would assume, this 
is a valid call

dmd -run sample.d --modelicalibs a b

``` d
import std;

void main(string[] args)
{
     string[] modelicaLibs;
     getopt(args, "modelicalibs", &modelicaLibs);
     assert(modelicaLibs == ["a", "b"]);
}
```

but it fails, as the array modelicaLibs only contains ["a"].

The std.getopt : arraySep documentation hints that it should work:
 The string used to separate the elements of an array or 
 associative array (default is "" which means the elements are 
 separated by whitespace).
Is my understanding wrong, or is this a bug? Kind regards André
Jul 14 2020
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/14/20 7:12 AM, Andre Pany wrote:
 Hi,
 
 by reading the documentation of std.getopt I would assume, this is a 
 valid call
 
 dmd -run sample.d --modelicalibs a b
 
 ``` d
 import std;
 
 void main(string[] args)
 {
      string[] modelicaLibs;
      getopt(args, "modelicalibs", &modelicaLibs);
      assert(modelicaLibs == ["a", "b"]);
 }
 ```
 
 but it fails, as array modelicaLibs only contains ["a"].
 
 The std.getopt : arraySep documentation hints that it should work:
 The string used to separate the elements of an array or associative 
 array (default is "" which means the elements are separated by 
 whitespace).
Is my understanding wrong, or is this a bug?
The whitespace separator doesn't get to your program. args is:

["sample", "--modelicalibs", "a", "b"]

There is no separator in the parameter to --modelicalibs, it's just "a".

What you need to do is:

dmd -run sample.d --modelicalibs "a b"

-Steve
Jul 14 2020
parent Paul Backus <snarwin gmail.com> writes:
On Tuesday, 14 July 2020 at 13:40:44 UTC, Steven Schveighoffer 
wrote:
 The whitespace separator doesn't get to your program. args is:

 ["sample", "--modelicalibs", "a", "b"]

 There is no separator in the parameter to --modelicalibs, it's 
 just "a".

 What you need to do is:

 dmd -run sample.d --modelicalibs "a b"

 -Steve
I thought this was the solution too, but when I actually tried it, I got `modelicaLibs == ["a b"]` and the assertion still failed.
Jul 14 2020
prev sibling parent reply Anonymouse <zorael gmail.com> writes:
On Tuesday, 14 July 2020 at 11:12:06 UTC, Andre Pany wrote:
 [...]
Steven Schveighoffer already answered while I was composing this, so I'm discarding the top half.

As far as I can tell, the documented behavior (a default arraySep of "" splitting the argument by whitespace) is simply not what the code does.

 https://github.com/dlang/phobos/blob/master/std/getopt.d#L923

    // ...
    else static if (isArray!(typeof(*receiver)))
    {
        // array receiver
        import std.range : ElementEncodingType;
        alias E = ElementEncodingType!(typeof(*receiver));

        if (arraySep == "")
        {
            *receiver ~= to!E(val);
        }
        else
        {
            foreach (elem; val.splitter(arraySep).map!(a => to!E(a))())
                *receiver ~= elem;
        }
    }

So you will probably want an arraySep of " " if you want --modelicalibs "a b".
Jul 14 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/14/20 9:51 AM, Anonymouse wrote:
 On Tuesday, 14 July 2020 at 11:12:06 UTC, Andre Pany wrote:
 [...]
Steven Schveighoffer already answered while I was composing this, so discarding top half. As far as I can tell the default arraySep of "" splitting the argument by whitespace is simply not the case.
 https://github.com/dlang/phobos/blob/master/std/getopt.d#L923
     // ...
     else static if (isArray!(typeof(*receiver)))
     {
         // array receiver
         import std.range : ElementEncodingType;
         alias E = ElementEncodingType!(typeof(*receiver));

         if (arraySep == "")
         {
             *receiver ~= to!E(val);
         }
         else
         {
             foreach (elem; val.splitter(arraySep).map!(a => to!E(a))())
                 *receiver ~= elem;
         }
     }

 So you will probably want an arraySep of " " if you want --modelicalibs "a b".
Hm... that looks like it IS actually expecting to do what Andre wants. It's adding each successive parameter.

If that doesn't work, then there's something wrong with the logic that decides whether a parameter is part of the previous argument or not.

Please file a bug.

-Steve
Jul 14 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/14/20 10:05 AM, Steven Schveighoffer wrote:

 Hm... that looks like it IS actually expecting to do what Andre wants. 
 It's adding each successive parameter.
 
 If that doesn't work, then there's something wrong with the logic that 
 decides whether a parameter is part of the previous argument or not.
 
 Please file a bug.
Belay that, the behavior is as designed. I think the issue is the documentation.

If arraySep is "", then it's not "separation by whitespace"; rather, you must repeat the parameter, and each one is appended to the array:

dmd -run sample.d --modelicalibs a --modelicalibs b

If you want to specify all the parameters in one, you have to provide an arraySep.

The documentation needs updating; it should say "parameters are added sequentially" or something like that, instead of "separation by whitespace".

-Steve
Jul 14 2020
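
The behavior described in this thread can be checked with a small sketch. The helper name `parseLibs` is made up for illustration and is not part of Phobos; the only library pieces assumed are `getopt` and the module-level `arraySep` variable from `std.getopt`. It compares the repeated-option form with the single-argument form that needs an explicit separator.

```d
import std.getopt;

// Hypothetical helper: parse --modelicalibs with a given arraySep.
// arraySep is a module-level variable in std.getopt, so it is reset
// on exit to avoid leaking into later getopt calls.
string[] parseLibs(string[] args, string sep = "")
{
    string[] libs;
    arraySep = sep;
    scope (exit) arraySep = "";
    getopt(args, "modelicalibs", &libs);
    return libs;
}

void main()
{
    // Default arraySep (""): repeat the option, each value is appended.
    assert(parseLibs(["sample", "--modelicalibs", "a", "--modelicalibs", "b"]) == ["a", "b"]);

    // Explicit separator: all values arrive in ONE argument and are split.
    assert(parseLibs(["sample", "--modelicalibs", "a,b"], ",") == ["a", "b"]);
    assert(parseLibs(["sample", "--modelicalibs", "a b"], " ") == ["a", "b"]);

    // The form from the original question: only "a" is consumed,
    // "b" is left behind in args as a positional argument.
    assert(parseLibs(["sample", "--modelicalibs", "a", "b"]) == ["a"]);
}
```

Because `arraySep` is global state, setting it affects every array option handled by the next `getopt` call, which is why the sketch resets it.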
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/14/20 10:22 AM, Steven Schveighoffer wrote:
 The documentation needs updating, it should say "parameters are added 
 sequentially" or something like that, instead of "separation by 
 whitespace".
https://github.com/dlang/phobos/pull/7557 -Steve
Jul 14 2020
parent reply Andre Pany <andre s-e-a-p.de> writes:
On Tuesday, 14 July 2020 at 14:33:47 UTC, Steven Schveighoffer 
wrote:
 On 7/14/20 10:22 AM, Steven Schveighoffer wrote:
 The documentation needs updating, it should say "parameters 
 are added sequentially" or something like that, instead of 
 "separation by whitespace".
https://github.com/dlang/phobos/pull/7557 -Steve
Thanks for the answer and the PR. Unfortunately, my goal here is to simulate a partner tool written in C/C++ which supports this behavior. I will also create an enhancement issue for supporting this behavior.

Kind regards
André
Jul 14 2020
parent reply Andre Pany <andre s-e-a-p.de> writes:
On Tuesday, 14 July 2020 at 15:48:59 UTC, Andre Pany wrote:
 On Tuesday, 14 July 2020 at 14:33:47 UTC, Steven Schveighoffer 
 wrote:
 On 7/14/20 10:22 AM, Steven Schveighoffer wrote:
 The documentation needs updating, it should say "parameters 
 are added sequentially" or something like that, instead of 
 "separation by whitespace".
https://github.com/dlang/phobos/pull/7557 -Steve
Thanks for the answer and the PR. Unfortunately, my goal here is to simulate a partner tool written in C/C++ which supports this behavior. I will also create an enhancement issue for supporting this behavior. Kind regards André
Enhancement issue: https://issues.dlang.org/show_bug.cgi?id=21045 Kind regards André
Jul 15 2020
parent reply Jon Degenhardt <jond noreply.com> writes:
On Wednesday, 15 July 2020 at 07:12:35 UTC, Andre Pany wrote:
 On Tuesday, 14 July 2020 at 15:48:59 UTC, Andre Pany wrote:
 On Tuesday, 14 July 2020 at 14:33:47 UTC, Steven Schveighoffer 
 wrote:
 On 7/14/20 10:22 AM, Steven Schveighoffer wrote:
 The documentation needs updating, it should say "parameters 
 are added sequentially" or something like that, instead of 
 "separation by whitespace".
https://github.com/dlang/phobos/pull/7557 -Steve
Thanks for the answer and the PR. Unfortunately, my goal here is to simulate a partner tool written in C/C++ which supports this behavior. I will also create an enhancement issue for supporting this behavior. Kind regards André
Enhancement issue: https://issues.dlang.org/show_bug.cgi?id=21045 Kind regards André
An enhancement is likely to hit some corner cases involving list termination, requiring choices that are not fully generic. The problem arises any time a legal list value looks like a legal option.

Perhaps the most important case is single-digit numeric options like '-1', '-2'. These are legal short-form options, and there are programs that use them. They are also somewhat common numeric values to include in command-line inputs.

I ran into a couple of cases like this with a getopt cover I wrote. The cover supports runtime processing of command arguments in the order entered on the command line rather than the compile-time getopt() call order. Since it was only for my stuff, not Phobos, it was an easy choice: disallow single-digit short options. But a Phobos enhancement might make other choices.

IIRC, a characteristic of the current getopt implementation is that it does not have run-time knowledge of all the valid options, so the set of ambiguous entries is larger than just the limited set of options specified in the program. Essentially, it is anything that looks syntactically like an option.

That doesn't mean an enhancement can't be built, just that there might be some constraints to be aware of.

--Jon
Jul 15 2020
parent reply Andre Pany <andre s-e-a-p.de> writes:
On Thursday, 16 July 2020 at 05:03:36 UTC, Jon Degenhardt wrote:
 On Wednesday, 15 July 2020 at 07:12:35 UTC, Andre Pany wrote:
 [...]
An enhancement is likely to hit some corner-cases involving list termination requiring choices that are not fully generic. Any time a legal list value looks like a legal option. Perhaps the most important case is single digit numeric options like '-1', '-2'. These are legal short form options, and there are programs that use them. They are also somewhat common numeric values to include in command lines inputs. [...]
My naive implementation would be that any dash would stop the list of multiple values. If you want to have a value containing a space or a dash, you enclose it with double quotes in the terminal.

 myapp --modelicalibs "file-a.mo" "file-b.mo" --log-level info

But you are right, there are corner cases to be considered.

Kind regards
Andre
Jul 16 2020
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/16/20 1:13 PM, Andre Pany wrote:
 On Thursday, 16 July 2020 at 05:03:36 UTC, Jon Degenhardt wrote:
 On Wednesday, 15 July 2020 at 07:12:35 UTC, Andre Pany wrote:
 [...]
An enhancement is likely to hit some corner-cases involving list termination requiring choices that are not fully generic. Any time a legal list value looks like a legal option. Perhaps the most important case is single digit numeric options like '-1', '-2'. These are legal short form options, and there are programs that use them. They are also somewhat common numeric values to include in command lines inputs. [...]
My naive implementation would be that any dash would stop the list of multiple values. If you want to have a value containing a space or a dash, you enclose it with double quotes in the terminal.
Enclosing with double quotes in the terminal does nothing:

myapp --modelicalibs "file-a.mo" "file-b.mo"

will give you EXACTLY the same string[] args as:

myapp --modelicalibs file-a.mo file-b.mo

I think Jon's point is that it's difficult to distinguish where an array list ends if you get the parameters as separate items. Like:

myapp --numbers 1 2 3 -5 -6

Is that

numbers => [1, 2, 3, -5, -6]

or is it

numbers => [1, 2, 3], 5 => true, 6 => true

This is probably why the code doesn't support that.

-Steve
Jul 16 2020
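
The ambiguity discussed above can be made concrete with a rough sketch of the "any dash ends the list" rule. The function `collectUntilDash` is purely hypothetical; it is not how std.getopt works, and it ignores details such as `--opt=value` syntax.

```d
import std.algorithm.searching : startsWith;

// Hypothetical rule: after `option`, take every following argument
// as a list value until one starts with a dash.
string[] collectUntilDash(string[] args, string option)
{
    string[] values;
    bool collecting = false;
    foreach (arg; args)
    {
        if (arg == option)
        {
            collecting = true;
            continue;
        }
        if (!collecting)
            continue;
        if (arg.startsWith("-"))
            break; // looks like an option: the list ends here
        values ~= arg;
    }
    return values;
}

void main()
{
    // Works when no value begins with a dash (a dash inside a value is fine).
    assert(collectUntilDash(
        ["myapp", "--modelicalibs", "file-a.mo", "file-b.mo", "--log-level", "info"],
        "--modelicalibs") == ["file-a.mo", "file-b.mo"]);

    // Negative numbers are cut off: -5 and -6 are treated as options.
    assert(collectUntilDash(
        ["myapp", "--numbers", "1", "2", "3", "-5", "-6"],
        "--numbers") == ["1", "2", "3"]);
}
```

Shell quoting does not help here, since the program receives the same `args` whether or not the values were quoted on the command line.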
parent Jon Degenhardt <jond noreply.com> writes:
On Thursday, 16 July 2020 at 17:40:25 UTC, Steven Schveighoffer 
wrote:
 On 7/16/20 1:13 PM, Andre Pany wrote:
 On Thursday, 16 July 2020 at 05:03:36 UTC, Jon Degenhardt 
 wrote:
 On Wednesday, 15 July 2020 at 07:12:35 UTC, Andre Pany wrote:
 [...]
An enhancement is likely to hit some corner-cases involving list termination requiring choices that are not fully generic. Any time a legal list value looks like a legal option. Perhaps the most important case is single digit numeric options like '-1', '-2'. These are legal short form options, and there are programs that use them. They are also somewhat common numeric values to include in command lines inputs. [...]
My naive implementation would be that any dash would stop the list of multiple values. If you want to have a value containing a space or a dash, you enclose it with double quotes in the terminal.
 Enclose with double quotes in the terminal does nothing:

 myapp --modelicalibs "file-a.mo" "file-b.mo"

 will give you EXACTLY the same string[] args as:

 myapp --modelicalibs file-a.mo file-b.mo

 I think Jon's point is that it's difficult to distinguish where an array list ends if you get the parameters as separate items. Like:

 myapp --numbers 1 2 3 -5 -6

 Is that

 numbers => [1, 2, 3, -5, -6]

 or is it

 numbers => [1, 2, 3], 5 => true, 6 => true

 This is probably why the code doesn't support that.

 -Steve
Yes, this is what I was getting at. Thanks for the clarification.

Also, it's not always immediately obvious what part of the argument splitting is being done by the shell, and what is being done by the program/getopt. Taking inspiration from the recent one-liners, here's a way to see how the program gets the args from the shell for different command lines:

$ echo 'import std.stdio; void main(string[] args) { args[1 .. $].writeln; }' | dmd -run - --numbers 1,2,3,-5,-6
["--numbers", "1,2,3,-5,-6"]

$ echo 'import std.stdio; void main(string[] args) { args[1 .. $].writeln; }' | dmd -run - --numbers 1 2 3 -5 -6
["--numbers", "1", "2", "3", "-5", "-6"]

$ echo 'import std.stdio; void main(string[] args) { args[1 .. $].writeln; }' | dmd -run - --numbers "1" "2" "3" "-5" "-6"
["--numbers", "1", "2", "3", "-5", "-6"]

$ echo 'import std.stdio; void main(string[] args) { args[1 .. $].writeln; }' | dmd -run - --numbers '1 2 3 -5 -6'
["--numbers", "1 2 3 -5 -6"]

The first case is what getopt supports now: all the values in a single string with a separator that getopt splits on. The 2nd and 3rd are identical from the program's perspective (Steve's point), but they've already been split, so getopt would need a different approach, and that requires dealing with ambiguity. The fourth form eliminates the ambiguity, but puts the burden on the user to use quotes.
Jul 16 2020
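
As a sketch of the first and fourth forms above, the following passes the same argument arrays the shell would produce straight to getopt. The helper `parseNumbers` is a made-up name; the code assumes only `getopt` and `arraySep` from `std.getopt`.

```d
import std.getopt;

// Hypothetical helper: parse --numbers into int[] using the given separator.
int[] parseNumbers(string[] args, string sep)
{
    int[] numbers;
    arraySep = sep;              // module-level setting in std.getopt
    scope (exit) arraySep = "";  // restore the default afterwards
    getopt(args, "numbers", &numbers);
    return numbers;
}

void main()
{
    // First form: --numbers 1,2,3,-5,-6 (one argument, comma-separated).
    assert(parseNumbers(["prog", "--numbers", "1,2,3,-5,-6"], ",") == [1, 2, 3, -5, -6]);

    // Fourth form: --numbers '1 2 3 -5 -6' (one quoted argument, space-separated).
    assert(parseNumbers(["prog", "--numbers", "1 2 3 -5 -6"], " ") == [1, 2, 3, -5, -6]);
}
```

In both cases the negative values sit inside a single argument, so getopt never has to decide whether `-5` is a value or an option.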