digitalmars.D - Am I reading this wrong, or is std.getopt *really* this stupid?

H. S. Teoh (35/35) Mar 23 2018 I just ran into this seemingly small problem:

Andrei Alexandrescu (15/29) Mar 23 2018 Affirmative. The implementation is quadratic (including a removal of the...

Chris Katko (15/53) Mar 23 2018 Is there a possibility of improving this function?

Seb (18/79) Mar 23 2018 Yes, Bugzilla is full of excellent ideas:

rumbu (10/27) Mar 23 2018 I saw this kind of "call to arms" post spreading around the forum

Andrei Alexandrescu (27/75) Mar 24 2018 At a point where a realistic benchmarks shows a need. Without a

H. S. Teoh (22/24) Mar 24 2018 [...]

Andrei Alexandrescu (16/46) Mar 24 2018 I'd have a difficult time interpreting the following as not

H. S. Teoh (28/46) Mar 24 2018 So what about making this configurable?

Andrei Alexandrescu (4/15) Mar 24 2018 That'd be great. I'm thinking something like an option

H. S. Teoh (7/23) Mar 24 2018 Great!
Johannes Pfau (14/31) Mar 25 2018 I don't really understand why you want to this keep lexical order

Andrei Alexandrescu (3/5) Mar 25 2018 I don't want. I think others will, once their programs depending on the

Adam D. Ruppe (8/10) Mar 26 2018 The current semantics are not documented, so any program that

Abdulhaq (8/11) Mar 25 2018 I thought this was a clever joke, but everyone is taking it

Abdulhaq (2/14) Mar 25 2018 Oops sorry to reply to myself, I realise my mistake now :-)

Andrei Alexandrescu (2/16) Mar 25 2018 To purge thy mistake: implement :o).

Jon Degenhardt (54/62) Mar 24 2018 Several of the TSV tools I built rely on command-line order.
Jonathan M Davis (18/23) Mar 24 2018 I think that there are at least a couple alternatives to std.getopt on

Seb (58/70) Mar 24 2018 Yeah I have "dumb XYZ, roll my own" experience often too.

Vladimir Panteleev (6/10) Mar 25 2018 funopt is based on getopt underneath, so this issue still applies
Adam D. Ruppe (21/24) Mar 26 2018 In my case, there's very little overlap with what Phobos offers.

H. S. Teoh (11/26) Mar 24 2018 [...]
H. S. Teoh (8/14) Mar 25 2018 [...]

Walter Bright (3/5) Mar 25 2018 And here it is:

Rubn (11/17) Mar 25 2018 Not a very comprehensive list. Virtually all of those issues have

Seb (32/52) Mar 25 2018 Well, first off - most of these issues are bug reports and would
Walter Bright (9/18) Mar 25 2018 Hence there's plenty of "need to be done" contributions to make!

Adam D. Ruppe (17/18) Mar 24 2018 The way I'd do this is to only use getopt to build the lists,

H. S. Teoh (6/33) Mar 24 2018 Touche. This uglifies the code a bit, but meh. It's just main(), no

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

I just ran into this seemingly small problem:

	void main(string[] args) {
		string[] searchPaths;
		getopt(args,
			"l", (string opt, string arg) {
				// searches through searchPaths
				openFile(arg);
			},
			"I", (string opt, string arg) {
				searchPaths ~= arg; 
			},
			...
		);
	}

Running the program with:

	program -I /path/to -l myfile

causes a runtime error that 'myfile' cannot be found, even though it
actually exists in /path/to/*.  I thought it was odd, since obviously
the -I is parsed before the -l, so searchPaths should already be set
when -l is seen, right?

Well, looking at the implementation of std.getopt turned up the
disturbing fact that the program's argument list is actually scanned
*multiple times*, one for each possible option(!).  Besides the bogonity
that whether or not searchPaths will be set prior to finding -l depends
on the order of arguments passed to getopt(), this also represents an
O(n*m) complexity in scanning program arguments, where n = number of
arguments and m = number of possible options.
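The effect described above can be observed with a small test program. This is a minimal sketch, not code from the thread: `handlerOrder` is a hypothetical helper that records the order in which the two callbacks fire, so the result can be compared with the order of the options on the command line. Which order you actually see depends on the std.getopt implementation in your Phobos version.

```d
import std.getopt;
import std.stdio;

// Hypothetical helper: returns the option names in the order in which
// getopt() invoked their callbacks.
string[] handlerOrder(string[] args)
{
    string[] fired;
    getopt(args,
        // "l" is declared first, "I" second.
        "l", (string opt, string arg) { fired ~= opt; },
        "I", (string opt, string arg) { fired ~= opt; });
    return fired;
}

void main()
{
    // On the command line, -I comes before -l.
    writeln(handlerOrder(["program", "-I", "/path/to", "-l", "myfile"]));
}
```

If the callbacks are reported in declaration order rather than command-line order, that is the behaviour this post complains about.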

And this is not to mention the fact that getoptImpl is a *recursive
template*.  Why, oh why?

Am I the only one who thinks the current implementation of getopt() is
really stupid??  Can somebody please talk some sense into me, or point
out something really obvious that I'm missing?


T

-- 
The state pretends to pay us a salary,
and we pretend to work.

Mar 23 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:

On 3/23/18 7:29 PM, H. S. Teoh wrote:
 Well, looking at the implementation of std.getopt turned up the
 disturbing fact that the program's argument list is actually scanned
 *multiple times*, one for each possible option(!).  Besides the bogonity
 that whether or not searchPaths will be set prior to finding -l depends
 on the order of arguments passed to getopt(), this also represents an
 O(n*m) complexity in scanning program arguments, where n = number of
 arguments and m = number of possible options.
 
 And this is not to mention the fact that getoptImpl is a *recursive
 template*.  Why, oh why?
 
 Am I the only one who thinks the current implementation of getopt() is
 really stupid??  Can somebody please talk some sense into me, or point
 out something really obvious that I'm missing?

Affirmative. The implementation is quadratic (including a removal of the 
option from the string). This is intentional, i.e. understood and 
acknowledged while I was working on it. Given that the function is only 
called once per run and with a number of arguments at most in the 
dozens, by the time its complexity becomes an issue the function is long 
beyond its charter.

This isn't the only instance of quadratic algorithms in Phobos. 
Quicksort uses an insertion sort - a quadratic algorithm - for 25 
elements or fewer. That algorithm may do 600 comparisons in the worst 
case, and it's potentially that many for each group of 25 elements in a 
large array.

Spending time on improving the speed of getopt is unrecommended. Such 
work would add no value.


Andrei

Mar 23 2018

Chris Katko <CKATKO GMAIL.COM> writes:

Is there a possibility of improving this function?

  - While quadratic, for low N, quadratic isn't a big deal. So at 
what point does quadratic for this function become "a problem"?

  - If it is a problem, what's stopping someone from improving it?

Last question though, is there any kind of list of features, and 
minor features and fixes that can or need to be done? Perhaps it 
already exists, but it seems like it'd be great to have a wiki of 
contribution sites (like this function) that someone could just 
browse and go "Hey, I know how to do X, maybe I'll take a crack 
at it." That way, devs who don't have time to improve something 
"low on the list" could still outsource it in a clear list 
instead of people who just happen to see it on the forum at the 
right place right time.

Mar 23 2018

Seb <seb wilzba.ch> writes:

Yes, Bugzilla is full of excellent ideas:

https://issues.dlang.org/buglist.cgi?component=phobos&list_id=220544&product=D&resolution=---

There are even some tags like "bootcamp" for someone who is 
looking to get started:

https://issues.dlang.org/buglist.cgi?component=phobos&keywords=bootcamp%2C%20preapproved&keywords_type=anywords&list_id=220545&product=D&query_format=advanced&resolution=---

We have also recently started to experiment with GitHub's new 
project dashboards. Currently they are tracking projects like 
improving the documentation, @safe-ty, DIP1000 etc.:

https://github.com/dlang/phobos/projects

DMD has a similar set which is based on Walter's recent post [1]

https://github.com/dlang/dmd/projects

Last, but not least there's a "Get involved" guide at the wiki:

https://wiki.dlang.org/Get_involved

As you couldn't find any of these pages, please let us know where 
you looked first, so that maybe we can make it easier for future 
people to find this information ;-)

[1] https://forum.dlang.org/post/p6oibo$1lmi$1 digitalmars.com

Mar 23 2018

rumbu <rumbu rumbu.ro> writes:

I have seen this kind of "call to arms" post spreading around the 
forum in the last few days. It is very nice to have some kind of 
plan, but before getting involved, I really would like to know how 
this can be done on Windows. And please don't redirect me to the 
wiki; the information there is clearly outdated and Linux-oriented 
(at least the test stuff):

https://forum.dlang.org/post/gxxfmrnezfrlodlhpiwe forum.dlang.org
https://forum.dlang.org/post/tulbhulbeqqzofdxevcg forum.dlang.org

Thanks.

Mar 23 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/24/2018 01:55 AM, Chris Katko wrote:
 Is there a possibility of improving this function?

Most likely it can be improved in terms of features.

   - While quadratic, for low N, quadratic isn't a big deal. So at what 
 point does quadratic for this function become "a problem"?

At a point where a realistic benchmark shows a need. Without a 
motivating measurement, making getopt faster would be a waste of time.

I mentioned another function: sort. For that, YES, there are ways of 
improving it. In fact, right after posting my message, I couldn't sleep 
thinking of a number of ways to improve the short sort part. We have 
precise benchmarks measuring the number of comparisons and swaps 
performed by our implementation of sort. Improving its performance lifts 
a lot of boats - many high-level algorithms use sort as an essential 
building block. There's a world of difference in impact of the speed of 
sort vs. speed of getopt.

Here are a few ideas for improving the small array sort part (the last 
mile of sort):

* Currently we switch to short sort when the number of elements to sort 
is smaller than max(32, 1024 / Elem.sizeof). Probably a better choice 
can be found through experimentation.

* Insertion sort does linear search in the already-sorted portion. 
Probably galloping search would fare better.

* Insertion sort starts from the end and grows the sorted portion down. 
Starting from the middle of the array and growing left and right 
simultaneously would slash the number of comparisons and swaps in half.

* There could be other algorithms better for short arrays, such as 
specialized versions of heapsort or smoothsort.
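As an illustration of the second idea, here is a minimal sketch of an insertion sort that searches the already-sorted portion with binary search instead of a linear scan. Binary search is used as a simpler stand-in for the galloping search mentioned above; `binaryInsertionSort` is a hypothetical name and this is not the Phobos implementation. It grows the sorted portion from the front and returns the number of comparisons so that variants can be measured against each other.

```d
import std.stdio;

// Sorts `a` in place and returns the number of element comparisons made.
// Sketch only: binary search replaces the linear scan; element moves are
// still linear per insertion.
size_t binaryInsertionSort(int[] a)
{
    size_t comparisons;
    foreach (i; 1 .. a.length)
    {
        immutable x = a[i];
        // Find the insertion point for x in the sorted prefix a[0 .. i].
        size_t lo = 0, hi = i;
        while (lo < hi)
        {
            immutable mid = lo + (hi - lo) / 2;
            ++comparisons;
            if (x < a[mid])
                hi = mid;
            else
                lo = mid + 1;   // go right on equality, keeping the sort stable
        }
        // Shift a[lo .. i] one slot to the right and drop x into place.
        for (size_t j = i; j > lo; --j)
            a[j] = a[j - 1];
        a[lo] = x;
    }
    return comparisons;
}

void main()
{
    auto data = [5, 2, 4, 6, 1, 3];
    immutable comparisons = binaryInsertionSort(data);
    writeln(data, " sorted with ", comparisons, " comparisons");
}
```

Whether this beats the linear scan for very short arrays would have to be settled by the kind of benchmark mentioned above, since fewer comparisons do not automatically mean less time.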

   - If it is a problem, what's stopping someone from improving it?

Hopefully very little.

 Last question though, is there any kind of list of features, and minor 
 features and fixes that can or need to be done? Perhaps it already 
 exists, but it seems like it'd be great to have a wiki of contribution 
 sites (like this function) that someone could just browse and go "Hey, I 
 know how to do X, maybe I'll take a crack at it." That way, devs who 
 don't have time to improve something "low on the list" could still 
 outsource it in a clear list instead of people who just happen to see it 
 on the forum at the right place right time.

Seb gave a great answer - thanks!


Andrei

Mar 24 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 08:27:48AM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 At a point where a realistic benchmark shows a need. Without a motivating
 measurement, making getopt faster would be a waste of time.

[...]

Guys, for crying out loud, my original complaint was not *performance*,
but that the (strange) choice of algorithm for getopt resulted in the
very counterintuitive behaviour that the order options are processed
depends on the order of option declarations rather than the order they
were specified on the command-line. This makes it basically impossible
to implement certain styles of option processing, such as that employed
in the popular ImageMagick suite of tools, where it matters that options
are processed in the order they are specified by the user, rather than
some arbitrary (to a user who doesn't and shouldn't care to know the
code) predetermined order.

My complaint about the quadratic algorithm was not in the fact that it's
quadratic, but that it exhibited this strange (and annoying!) behaviour,
especially since the saner (IMO) non-quadratic algorithm would have been
the expected choice in the first place, that would *not* have had this
problem. It felt almost like we went out of our way just to make things
counterintuitive, with slowness added as a cherry on top.
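For comparison, a single-pass parser that dispatches options in command-line order could look roughly like this. It is a hand-rolled sketch, not std.getopt and not a proposed patch: `Handler`, `parseInOrder` and `demo` are hypothetical names, and it only understands options of the form `-x value`.

```d
import std.stdio;

alias Handler = void delegate(string value);

// Walks the arguments once, left to right, and calls the handler registered
// for each option as soon as it is seen. Returns the non-option arguments.
string[] parseInOrder(string[] args, Handler[string] handlers)
{
    string[] rest;
    for (size_t i = 1; i < args.length; ++i)   // args[0] is the program name
    {
        auto handler = args[i] in handlers;    // one hash lookup per argument
        if (handler is null)
        {
            rest ~= args[i];
            continue;
        }
        if (i + 1 >= args.length)
            throw new Exception("Missing value for " ~ args[i]);
        (*handler)(args[++i]);                 // consume the option's value
    }
    return rest;
}

// Records what happens so that the processing order is visible.
string[] demo(string[] args)
{
    string[] events;
    Handler[string] handlers;
    handlers["-I"] = delegate void(string v) { events ~= "path:" ~ v; };
    handlers["-l"] = delegate void(string v) { events ~= "open:" ~ v; };
    parseInOrder(args, handlers);
    return events;
}

void main()
{
    writeln(demo(["program", "-I", "/path/to", "-l", "myfile"]));
}
```

Because each argument is looked up once, the handlers run in the order the user typed the options, and the cost grows with the number of arguments rather than with arguments times options.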


T

-- 
"You are a very disagreeable person." "NO."

Mar 24 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/24/2018 09:36 AM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 08:27:48AM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 At a point where a realistic benchmark shows a need. Without a motivating
 measurement, making getopt faster would be a waste of time.

 [...]
 
 Guys, for crying out loud, my original complaint was not *performance*,
 but that the (strange) choice of algorithm for getopt resulted in the
 very counterintuitive behaviour that the order options are processed
 depends on the order of option declarations rather than the order they
 were specified on the command-line.

I'd have a difficult time interpreting the following as not 
performance-related:

 Well, looking at the implementation of std.getopt turned up the
 disturbing fact that the program's argument list is actually scanned
 *multiple times*, one for each possible option(!).  Besides the bogonity
 that whether or not searchPaths will be set prior to finding -l depends
 on the order of arguments passed to getopt(), this also represents an
 O(n*m) complexity in scanning program arguments, where n = number of
 arguments and m = number of possible options.

Anyhow. Right now the order of processing is the same as the lexical 
order in which flags are passed to getopt. There may be use cases for 
which that's the more desirable way to go about things, so if you author 
a PR to change the order you'd need to build an argument on why 
command-line order is better. FWIW the traditional POSIX doctrine makes 
behavior of flags independent of their order, which would imply the 
current choice is more natural.

 This makes it basically impossible
 to implement certain styles of option processing, such as that employed
 in the popular ImageMagick suite of tools, where it matters that options
 are processed in the order they are specified by the user, rather than
 some arbitrary (to a user who doesn't and shouldn't care to know the
 code) predetermined order.

This is an exaggeration. Yes you can't process with lambdas. You can 
always collect options first, process after. This is a well-supported 
use case.
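A minimal sketch of the collect-first, process-after approach, assuming the `-I` and `-l` options from the original post: getopt() only fills two arrays, and the files are handled afterwards, once every search path is known. `Options` and `collect` are hypothetical names, and the loop in main() merely stands in for the real file lookup.

```d
import std.getopt;
import std.stdio;

struct Options
{
    string[] searchPaths;   // every -I value
    string[] files;         // every -l value
}

// Step 1: let getopt() collect the values. Binding an option to an array
// appends one element for each occurrence of that option.
Options collect(string[] args)
{
    Options opts;
    getopt(args,
        "I", &opts.searchPaths,
        "l", &opts.files);
    return opts;
}

void main(string[] args)
{
    auto opts = collect(args);
    // Step 2: process after parsing, when all search paths are available.
    foreach (file; opts.files)
        writeln("would look for ", file, " in ", opts.searchPaths);
}
```

This sidesteps the ordering question for this case, at the price of losing the relative order of different options, which is what tools with order-sensitive options need.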

 My complaint about the quadratic algorithm was not in the fact that it's
 quadratic, but that it exhibited this strange (and annoying!) behaviour,
 especially since the saner (IMO) non-quadratic algorithm would have been
 the expected choice in the first place, that would *not* have had this
 problem. It felt almost like we went out of our way just to make things
 counterintuitive, with slowness added as a cherry on top.

I want to be convinced. I think you'd need to build a better case on why 
you consider one behavior intuitive and the other counterintuitive.


Andrei

Mar 24 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.

So what about making this configurable?

And documented?  Last time I checked, this was not clearly stated in the
docs.


[...]
 My complaint about the quadratic algorithm was not in the fact that
 it's quadratic, but that it exhibited this strange (and annoying!)
 behaviour, especially since the saner (IMO) non-quadratic algorithm
 would have been the expected choice in the first place, that would
 *not* have had this problem. It felt almost like we went out of our
 way just to make things counterintuitive, with slowness added as a
 cherry on top.

 
 I want to be convinced. I think you'd need to build a better case on
 why you consider one behavior intuitive and the other
 counterintuitive.

[...]

Honestly, I've wasted far too much time writing about this on the forum
already.  In the time it took to argue about this, I could have already
written my own version of getopt that does what I want, instead of
fighting with strange design decisions in Phobos.  I'm not going to
waste any more time arguing about this, since, after all, it *is* "just"
getopt().

This was not the only issue I struggled with, as std.getopt has other
design differences incompatible with Posix getopt() that make it hard
to support the original semantics of a previous C++ project ported to D.
Yes, I could have used Posix getopt() from D, but that requires some
ugly shim code, tons of toStringz/fromStringz, doesn't take advantage of
things like automatic enum conversions, etc., which sux given that we're
in D, not C++.

And given the defensiveness surrounding std.getopt, my conclusion can
only be: dump std.getopt, roll my own.  It's sad, since in general
Phobos design tends to be superior to its C++ counterparts.  But we then
have warts like std.getopt that people refuse to acknowledge as a
problem.  So be it.


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi

Mar 24 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.

 
 So what about making this configurable?

That'd be great. I'm thinking something like an option 
std.getopt.config.commandLineOrder. Must be first option specified right 
after arguments. Sounds good?

Mar 24 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 05:24:28PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the
 lexical order in which flags are passed to getopt. There may be
 use cases for which that's the more desirable way to go about
 things, so if you author a PR to change the order you'd need to
 build an argument on why command-line order is better. FWIW the
 traditional POSIX doctrine makes behavior of flags independent of
 their order, which would imply the current choice is more natural.

 
 So what about making this configurable?

 
 That'd be great. I'm thinking something like an option
 std.getopt.config.commandLineOrder. Must be first option specified
 right after arguments. Sounds good?

Great!

Not so sure how easy it is to implement while supporting everything
else, though, given the current structure of the code.


T

-- 
Time flies like an arrow. Fruit flies like a banana.

Mar 24 2018

Johannes Pfau <nospam example.com> writes:

On Sat, 24 Mar 2018 17:24:28 -0400, Andrei Alexandrescu wrote:

 On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
 Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.

 
 So what about making this configurable?

 
 That'd be great. I'm thinking something like an option
 std.getopt.config.commandLineOrder. Must be first option specified right
 after arguments. Sounds good?

I don't really understand why you want to keep this lexical order 
functionality. There's a well defined use case for command line order: 
Allowing users to write commands in a natural, left-to-right style, where 
options on the right are more specific: systemctl status -l ...

I've never heard of any use case where the lexical order of the arguments 
passed to getopt matters for parsing user supplied command arguments. Is 
there any use case for this?

I thought the only reason we have this lexical order parsing is because 
it's simpler to implement. But if we'll get the non-quadratic command-line 
order implementation there's no reason to keep and maintain the quadratic 
implementation.

-- 
Johannes

Mar 25 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/25/18 10:22 AM, Johannes Pfau wrote:
 I don't really understand why you want to keep this lexical order
 functionality.

I don't want to. I think others will, once their programs that depend on 
the current semantics run into trouble.

Mar 25 2018

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 25 March 2018 at 14:25:49 UTC, Andrei Alexandrescu 
wrote:
 I don't want to. I think others will, once their programs that 
 depend on the current semantics run into trouble.

The current semantics are not documented, so any program that 
relies on them is foolish anyway.

Like I said in my code, I read options and run them in separate 
orders since I specifically want control - I imagine most 
everyone else does too, since otherwise you are depending on 
underspecified behavior and liable to break without notice.

Mar 26 2018

Abdulhaq <alynch4047 gmail.com> writes:

On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu 
wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option 
 specified right after arguments. Sounds good?

I thought this was a clever joke, but everyone is taking it 
seriously ?!

"When running mygreatprog.exe, always run with 
--command-line-order CommandLine as the first command line 
option, otherwise mygreatprog.exe may misinterpret the command 
line"

Mar 25 2018

Abdulhaq <alynch4047 gmail.com> writes:

On Sunday, 25 March 2018 at 14:46:23 UTC, Abdulhaq wrote:
 On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu 
 wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option 
 specified right after arguments. Sounds good?

 I thought this was a clever joke, but everyone is taking it 
 seriously ?!

 "When running mygreatprog.exe, always run with 
 --command-line-order CommandLine as the first command line 
 option, otherwise mygreatprog.exe may misinterpret the command 
 line"

Oops sorry to reply to myself, I realise my mistake now :-)

Mar 25 2018

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/25/18 10:48 AM, Abdulhaq wrote:
 On Sunday, 25 March 2018 at 14:46:23 UTC, Abdulhaq wrote:
 On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option specified 
 right after arguments. Sounds good?

 I thought this was a clever joke, but everyone is taking it seriously ?!

 "When running mygreatprog.exe, always run with --command-line-order 
 CommandLine as the first command line option, otherwise 
 mygreatprog.exe may misinterpret the command line"

 
 Oops sorry to reply to myself, I realise my mistake now :-)

To purge thy mistake: implement :o).

Mar 25 2018

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 24 March 2018 at 16:11:18 UTC, Andrei Alexandrescu
wrote:
Anyhow. Right now the order of processing is the same as the
lexical order in which flags are passed to getopt. There may be
use cases for which that's the more desirable way to go about
things, so if you author a PR to change the order you'd need to
build an argument on why command-line order is better. FWIW the
traditional POSIX doctrine makes behavior of flags independent
of their order, which would imply the current choice is more
natural.

Several of the TSV tools I built rely on command-line order.
There is an enhancement request here:
https://issues.dlang.org/show_bug.cgi?id=16539.

A few of the tools use a paradigm where the user is entering a
series of instructions on the command line, and there are times when
the user-entered order matters. Two general cases:

* Display/output order - The tool produces delimited output, and
the user wants to control the order. The order of command line
options determines the order.

* Short-circuiting - tsv-filter in particular allows numeric
tests like less-than, but also allows the user to short-circuit
the test by testing if the data contains a valid number prior to
making the numeric test. This is done by evaluating the command
line arguments in left-to-right order.

Short-circuiting is supported by the Unix `find` utility.

I have used this approach for CLI tools I've written in other
languages. Perl's Getopt::Long processes args in command-line
order, so it supports this.

I considered submitting a PR to getopt to change this, but
decided against it. The approach used looks like it is central to
the design, and changing it in a backward compatible way would be
a meaningful undertaking. Instead I wrote a cover to getopt that
processes arguments in command-line order. It is here:
https://github.com/eBay/tsv-utils-dlang/blob/master/common/s
c/getopt_inorder.d. It handles most of what std.getopt handles.
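
A minimal sketch of what "command-line order" means in practice. This is not the getopt_inorder code linked above, just a hand-rolled loop; the names `Seen` and `inOrder` are hypothetical, and it assumes every option is written as `--name` optionally followed by one value:

```d
import std.algorithm.searching : startsWith;
import std.stdio : writeln;

/// One option as it appeared on the command line (hypothetical type).
struct Seen
{
    string name;
    string value;
}

/// Hypothetical helper: collects "--name [value]" options strictly in
/// command-line order. Positional arguments are skipped in this sketch.
Seen[] inOrder(string[] args)
{
    Seen[] result;
    for (size_t i = 1; i < args.length; ++i) // args[0] is the program name
    {
        if (!args[i].startsWith("--"))
            continue;
        string name = args[i][2 .. $];
        string value;
        // A following token that is not itself an option is its value.
        if (i + 1 < args.length && !args[i + 1].startsWith("--"))
            value = args[++i];
        result ~= Seen(name, value);
    }
    return result;
}

void main()
{
    // The output columns follow the order the user typed the operators in.
    foreach (s; inOrder(["prog", "--sum", "3", "--median", "2", "--sum", "5"]))
        writeln(s.name, " ", s.value);
}
```

Because the loop walks the arguments left to right, both the display-order and the short-circuiting cases fall out of the traversal itself.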

The TSV utilities documentation should help illustrate these
cases. tsv-filter uses short-circuiting:
https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#t
v-filter-reference. Look for "Short-circuiting expressions" toward the bottom
of the section.

tsv-summarize obeys the command-line order for output/display.
See:
https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-summarize-reference.

There's one other general limitation I encountered with the
current compile-time approach to command-line argument
processing. I couldn't find a clean way to allow it to be
extended in a plug-in manner.

In particular, the original goal for the tsv-summarize tool was
to allow users to create custom operators. The tool has a fair
number of built-in operators, like median, sum, min, max, etc.
Each of these operators has a getopt arg invoking it, eg.
'--median', '--sum', etc. However, it is common for people to
have custom analysis needs, so allowing extension of the set
would be quite useful.

The code is set up to allow this. People would clone the repo,
write their own operator, placed in a separate file they
maintain, and rebuild. However, I couldn't figure out a clean way
to allow additions to the command line argument set. There may be a
reasonable way and I just couldn't find it, but my current
thinking is that I need to write my own command line argument
handler to support this idea.

I think handling command line argument processing at run-time
would make this simpler, at the cost of losing some compile-time
validation.
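
One way the run-time idea could look, as a rough sketch only. `Registry`, `register` and `dispatch` are hypothetical names, not part of std.getopt or the TSV tools, and the sketch assumes every option takes exactly one value:

```d
import std.algorithm.searching : startsWith;
import std.conv : to;
import std.stdio : writeln;

alias Handler = void delegate(string value);

/// Hypothetical run-time option table: plug-ins add entries before parsing.
struct Registry
{
    Handler[string] handlers; // option name -> callback

    void register(string name, Handler handler)
    {
        handlers[name] = handler;
    }

    /// Walks "--name value" pairs in command-line order and returns the
    /// names that had no registered handler.
    string[] dispatch(string[] args)
    {
        string[] unknown;
        for (size_t i = 1; i + 1 < args.length; i += 2)
        {
            string name = args[i].startsWith("--") ? args[i][2 .. $] : args[i];
            if (auto handler = name in handlers)
                (*handler)(args[i + 1]);
            else
                unknown ~= name; // only detectable at run time
        }
        return unknown;
    }
}

void main()
{
    Registry registry;
    double total = 0;
    // A custom operator registers itself without touching the parser.
    registry.register("sum", (string v) { total += v.to!double; });
    auto unknown = registry.dispatch(
        ["prog", "--sum", "1.5", "--sum", "2.5", "--mode", "x"]);
    writeln(total, " ", unknown);
}
```

The trade-off is the one described above: a misspelled or unsupported option is only noticed while parsing, not at compile time.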

--Jon

Mar 24 2018

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Saturday, March 24, 2018 09:59:44 H. S. Teoh via Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my conclusion can
 only be: dump std.getopt, roll my own.  It's sad, since in general
 Phobos design tends to be superior to its C++ counterparts.  But we then
 have warts like std.getopt that people refuse to acknowledge is a
 problem.  So be it.

I think that there are at least a couple alternatives to std.getopt on
code.dlang.org if you want alternatives. Personally, the only complaints
I've had with std.getopt are that bundling isn't the default and that it's
not always easy to figure out whether an argument has been set or not. But
at least the bundling can be configured, and getopt can probably be improved
to work with Nullable so that it'll be easier to figure out whether an
argument has been set.
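
Until something like that exists, a possible workaround is to bind the option to a callback that fills a Nullable. A minimal sketch, assuming an integer option named "jobs"; the function name `parseJobs` and the option name are made up for illustration:

```d
import std.conv : to;
import std.getopt : getopt;
import std.typecons : Nullable;

/// Hypothetical example: returns a null Nullable when --jobs was not given.
Nullable!int parseJobs(string[] args)
{
    Nullable!int jobs; // starts out null

    // getopt accepts callbacks taking (option, value); this one only runs
    // when the option is actually present on the command line.
    void onJobs(string option, string value)
    {
        jobs = value.to!int;
    }

    getopt(args, "jobs", &onJobs);
    return jobs;
}

void main(string[] args)
{
    import std.stdio : writeln;

    auto jobs = parseJobs(args);
    writeln(jobs.isNull ? "jobs not set" : "jobs was set");
}
```

This keeps "not given" distinct from "given with the default value", at the cost of one small nested function per option.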

As for defensiveness, I'm not quite sure what you're referring to. The main
point was that given how often getopt gets called in a program, improving
its Big-O complexity isn't worth it, but there have been a number of
improvements to getopt over the years, so it's not like we're not allowed to
improve it. It's just that improving its Big-O complexity is kind of
pointless. In any case, as Andrei said, if a new option can be added to fix
your use case, then that shouldn't be a problem, though I have no clue how
much of a pain that will be to implement, particularly since std.getopt
isn't exactly simple.

- Jonathan M Davis

Mar 24 2018

Seb <seb wilzba.ch> writes:

On Sunday, 25 March 2018 at 04:30:31 UTC, Jonathan M Davis wrote:
 On Saturday, March 24, 2018 09:59:44 H. S. Teoh via 
 Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my 
 conclusion can only be: dump std.getopt, roll my own.  It's 
 sad, since in general Phobos design tends to be superior to


Yeah I have the "dump XYZ, roll my own" experience often too.
As there are already many big libraries like `arsd` or `ae` out 
there, I don't think I'm the only one with this feeling.
I wonder if someone will ever try to fork/reboot Phobos with all the 
goodies, but without the legacy cruft like auto-decoding and 
similar friends for which breaking changes can't be made.

 its C++ counterparts.  But we then have warts like std.getopt 
 that people refuse to acknowledge is a problem.  So be it.

 I think that there are at least a couple alternatives to 
 std.getopt on code.dlang.org if you want alternatives.

Yes, two good ones are:

https://blog.thecybershadow.net/2014/08/05/ae-utils-funopt
http://code.dlang.org/packages/darg

 Personally, the only complaints I've had with std.getopt is

Hehe I like many things about std.getopt, but it's not perfect 
either.
A few examples:

- I often just want to map CLI arguments to a config object where 
using UDAs would be more natural and less boilerplate

     struct Config {
         @option("c|compiler")
         string compiler;
     }

Now with the rejected/postponed __traits(documentation) the ddoc 
help text could be automatically read and put into the 
auto-generated CLI help.
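
A rough sketch of how such a UDA could be layered on top of the existing getopt, assuming a hand-written `option` attribute and a helper named `parseInto` (both hypothetical, neither exists in Phobos). It simply calls getopt once per annotated field:

```d
import std.getopt : config, getopt;
import std.traits : getUDAs, hasUDA;

/// Hypothetical UDA carrying the getopt option string.
struct option
{
    string names;
}

struct Config
{
    @option("c|compiler") string compiler;
    @option("v|verbose") bool verbose;
}

/// Hypothetical helper: fills every @option field of T from args.
/// Options that match no field are left in args (passThrough).
T parseInto(T)(ref string[] args)
{
    T cfg;
    static foreach (name; __traits(allMembers, T))
    {
        static if (hasUDA!(__traits(getMember, T, name), option))
        {
            getopt(args, config.passThrough,
                getUDAs!(__traits(getMember, T, name), option)[0].names,
                &__traits(getMember, cfg, name));
        }
    }
    return cfg;
}

void main(string[] args)
{
    import std.stdio : writeln;

    auto cfg = parseInto!Config(args);
    writeln(cfg.compiler, " ", cfg.verbose);
}
```

Calling getopt per field keeps the sketch short; it also means no combined help text is produced, which a real implementation would want to address.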

- I don't like to manually check for .helpWanted

Imho I constantly find myself doing this:

     if (helpInformation.helpWanted)
     {
`DDoc wrapper
All unknown options are passed to the compiler.
./ddoc <file>...
`.defaultGetoptPrinter(helpInformation.options);
         return 1;
     }

I would have preferred this being the default behavior or at 
least the default behavior if a help text string is explicitly 
provided e.g. like:

     getopt(`My program
./program ...`, args, ...);

or maybe with something like `.withHelp("")`
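
In the meantime the check can at least be folded into a small helper. A minimal sketch; `printHelpIfWanted` is a made-up name, and it only wraps the existing `helpWanted` flag and `defaultGetoptPrinter`:

```d
import std.getopt : defaultGetoptPrinter, getopt, GetoptResult;

/// Hypothetical helper: prints `header` plus the option table when
/// --help was given, and reports whether it did so.
bool printHelpIfWanted(GetoptResult result, string header)
{
    if (!result.helpWanted)
        return false;
    defaultGetoptPrinter(header, result.options);
    return true;
}

int main(string[] args)
{
    bool verbose;
    auto result = getopt(args, "v|verbose", "Print extra output", &verbose);
    if (printHelpIfWanted(result, "My program\n./program ..."))
        return 1;
    return 0;
}
```

This does not make help handling the default, it just moves the boilerplate out of main().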

- setting shared configs doesn't work

I know, I could use a TLS config or use an atomicOp or 
synchronized assignment to set it, but often casting is easier 
and that's rather ugly:

https://github.com/dlang/dlang.org/blob/master/dspec_tester.d#L101

- in theory getopt should be @safe and use ref

It just does a bit of string manipulation, but it looks like we 
have to wait until DIP1000 for this:

https://github.com/dlang/phobos/pull/6281

Also, similarly to std.stdio.readf or std.format.formattedRead, 
there's no need to use pointers; D's @safe ref would have worked 
too. Now, it looks like this change can't be made anymore as it 
would be a breaking one due to ambiguities.

- it would be really cool to support generating zsh/bash 
completions files

This is the last point on my list as it's not really a limitation 
of std.getopt and GetoptResult should be enough for this, but it 
looks like no one bothered enough to write a zshGetoptPrinter so 
far.

 getopt can probably be improved to work with Nullable so that 
 it'll be easier to figure out whether an argument has been set.

Yes, supporting Nullable would be really cool!

Mar 24 2018

Vladimir Panteleev <thecybershadow.lists gmail.com> writes:

On Sunday, 25 March 2018 at 06:58:50 UTC, Seb wrote:
 I think that there are at least a couple alternatives to 
 std.getopt on code.dlang.org if you want alternatives.

 Yes, two good ones are:

 https://blog.thecybershadow.net/2014/08/05/ae-utils-funopt

funopt is based on getopt underneath, so this issue still applies 
to it, sorry!

Well, funopt translates options to function arguments, so there's 
no way to specify a delegate anyway, but at least the performance 
aspect applies.

Mar 25 2018

Adam D. Ruppe <destructionator gmail.com> writes:

On Sunday, 25 March 2018 at 06:58:50 UTC, Seb wrote:
 Yeah I have the "dump XYZ, roll my own" experience often too.
 As there are already many big libraries like `arsd` or `ae` out 
 there, I don't think I'm the only one with this feeling.

In my case, there's very little overlap with what Phobos offers. 
And in cases where there is, it is usually either 1) built on 
top of Phobos (e.g. my jsvar actually depends on std.json), 
and/or 2) older than the Phobos offering, often significantly so.

Well, there's also a few explicitly redone functions like `to` 
hidden inside my color.d, but that's a dependency bloat thing and 
less relevant now that phobos is getting its own messy import web 
under control (this came when importing the most trivial phobos 
module would triple the build time and double the executable size 
of all my gui apps, so the gui module tree in my libs were all 
phobos free, and that did mean a few trivial reimplementations 
but to!string(int) is like a five liner sooo easy trade there. 
But now the situation is much better.)


But anyway, what Phobos does, it tends to do reasonably well in 
my view (with a couple glaring exceptions). I kinda like 
std.getopt. It isn't perfect and I could do better... but it is 
there and it is good enough, so I defend it.

Phobos just doesn't even attempt most of what I need, so I also 
have a LOT of reusable add-on code that I call the arsd repo, 
which is kinda part two of my personal D standard library :) .

Mar 26 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 10:30:31PM -0600, Jonathan M Davis via Digitalmars-d
wrote:
 On Saturday, March 24, 2018 09:59:44 H. S. Teoh via Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my conclusion
 can only be: dump std.getopt, roll my own.  It's sad, since in
 general Phobos design tends to be superior to its C++ counterparts.
 But we then have warts like std.getopt that people refuse to
 acknowledge is a problem.  So be it.


[...]
 As for defensiveness, I'm not quite sure what you're referring to. The
 main point was that given how often getopt gets called in a program,
 improving its Big-O complexity isn't worth it, but there have been a
 number of improvements to getopt over the years, so it's not like
 we're not allowed to improve it. It's just that improving its Big-O
 complexity is kind of pointless. In any case, as Andrei said, if a new
 option can be added to fix your use case, then that shouldn't be a
 problem, though I have no clue how much of a pain that will be to
 implement, particularly since std.getopt isn't exactly simple.

[...]

OK, the part about defensiveness may be just my overreaction. I
apologize.  But yeah, I glanced at the code, and don't see any easy way
to implement what Andrei agreed with. It's just too much work for
something I could just write for myself in a much shorter time. I guess
I'll just log an enhancement request in bugzilla and leave it at that.


T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean
that Windows is normally unsafe?

Mar 24 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 10:05:36PM -0700, H. S. Teoh via Digitalmars-d wrote:
[...]
 OK, the part about defensiveness may be just my overreaction. I
 apologize.  But yeah, I glanced at the code, and don't see any easy
 way to implement what Andrei agreed with. It's just too much work for
 something I could just write for myself in a much shorter time. I
 guess I'll just log an enhancement request in bugzilla and leave it at
 that.

[...]

Turns out, there's already been an issue for this, filed 2 years ago:

https://issues.dlang.org/show_bug.cgi?id=16539


T

-- 
People say I'm arrogant, and I'm proud of it.

Mar 25 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, and minor
 features and fixes that can or need to be done? Perhaps it already exists,

And here it is:

https://issues.dlang.org/

Mar 25 2018

Rubn <where is.this> writes:

On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, 
 and minor features and fixes that can or need to be done? 
 Perhaps it already exists,

 And here it is:

 https://issues.dlang.org/

Not a very comprehensive list. Virtually all of those issues have 
no comment on them. If it's a feature request you might as well assume 
it requires a DIP cause there's no reason to otherwise waste your 
time implementing the feature. Oddly enough there's almost no 
other way to get anyone's attention of whether a feature requires 
a DIP or not unless there's a pull request for the feature. So if 
you want a feature you almost have to risk wasting your time 
implementing it. It's not a very good system, but someone throws 
up some stats about how many issues get solved/pull requests get 
created per month and they conclude that it's working fine.

Mar 25 2018

Seb <seb wilzba.ch> writes:

On Sunday, 25 March 2018 at 13:23:04 UTC, Rubn wrote:
 On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features,
 and minor features and fixes that can or need to be done?
 Perhaps it already exists,

 And here it is:

 https://issues.dlang.org/

 Not a very comprehensive list. Virtually all of those issues
 have no comment on them. If it's a feature request you might as
 well assume it requires a DIP cause there's no reason to otherwise
 waste your time implementing the feature.

Well, first off - most of these issues are bug reports and would
obviously be pulled if fixed.
Also, bigger improvements to Phobos don't require a DIP - just
Andrei's approval.
There have been many discussions on this (a recent one:
https://forum.dlang.org/post/mailman.1298.1521583794.3374.digitalma
s-d puremagic.com), but in short it's going to stay like this, but you can
easily shoot Andrei a mail _before_ doing something bigger at Phobos.

Now regarding language features, the DIP process has been
revamped:

https://forum.dlang.org/post/p95hjs$1nf$1 digitalmars.com

 Oddly enough there's almost no other way to get anyone's
 attention of whether a feature requires a DIP or not unless
 there's a pull request for the feature. So if you want a
 feature you almost have to risk wasting your time implementing
 it.

Improvements to the language will require a DIP. Sometimes (like
e.g. for a new trait) it's possible to get direct approval by
Walter/Andrei on GitHub, but the rule of thumb is that it needs a
DIP.

When in doubt, you can discuss a new feature on Slack, IRC or this NG
here.

 It's not a very good system, but someone throws up some stats
 about how many issues get solved/pull requests get created per
 month and they conclude that it's working fine.

You aren't alone:

https://github.com/wilzbach/state-of-d-2018/blob/master/09c:%20What%2C%20if%20anything%2C%20do%20you%20dislike%20about%20D's%20issue%20process%3F

https://github.com/wilzbach/state-of-d-2018/blob/master/09d:%20What%2C%20if%20anything%2C%20has%20prevented%20you%20from%20opening%20an%20issue%3F

There are many improvements hopefully coming up to
issues.dlang.org in the near future:

https://forum.dlang.org/post/tneyowfjewrlrtnqsuvd forum.dlang.org

If you have more specific ideas on what could be done to improve
issues.dlang.org, share them on #dlang_org in Slack, here or in
Bugzilla.

Note:
- Switching to GH issues is a common request and while I also fought
for that one (I even tested an automatic migration to GH), there
are quite a few downsides with the GH issue tracker and at the moment
the consensus is to give Mozilla's fork of Bugzilla a fair try
- "there are almost no comments" on issues isn't actionable, but
a system/idea to improve this would be.

Mar 25 2018

Walter Bright <newshound2 digitalmars.com> writes:

On 3/25/2018 6:23 AM, Rubn wrote:
 On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, and minor 
 features and fixes that can or need to be done? Perhaps it already exists,

 And here it is:

 https://issues.dlang.org/

 
 Virtually all of those issues have no comment on them.

Hence there's plenty of "need to be done" contributions to make!

People often make very valuable contributions to issues in the comments by:

1. producing a reduced test case (the smaller the test case, the easier it is
to track down)
2. finding the cause of the bug
3. finding the pull request that introduced the bug
4. connecting to related work
5. any other information helpful in resolving it

Mar 25 2018

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 23 March 2018 at 23:29:48 UTC, H. S. Teoh wrote:
 I just ran into this seemingly small problem:

The way I'd do this is to only use getopt to build the lists, 
then actually process them externally. (lol adding another loop)

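import std.getopt;

string[] searchPaths;
string[] files;

// getopt only collects the values here; no real work happens yet
getopt(args,
   "l", &files,
   "I", &searchPaths
);

// every option has been parsed by now, so searchPaths is complete
foreach(file; files)
   openFile(file);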


then it is clear what order your operations are done in anyway, 
and you have a chance to perhaps report bad syntax before 
actually doing any real work.

Wouldn't it be weird for example if

$ cat foo.d --help


spat out the contents followed by the help?

Mar 24 2018

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Sat, Mar 24, 2018 at 01:43:10PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Friday, 23 March 2018 at 23:29:48 UTC, H. S. Teoh wrote:
 I just ran into this seemingly small problem:

 
 The way I'd do this is to only use getopt to build the lists, then
 actually process them externally. (lol adding another loop)
 
 string[] searchPaths;
 string[] files;
 
 getopt(args,
   "l", &files,
   "I", &searchPaths
 );
 
 foreach(file; files)
   openFile(file);
 
 
 then it is clear what order your operations are done in anyway, and
 you have a chance to perhaps report bad syntax before actually doing
 any real work.
 
 Wouldn't it be weird for example if
 
 $ cat foo.d --help
 
 spat out the contents followed by the help?

Touché.  This uglifies the code a bit, but meh. It's just main(), no
biggie.


T

-- 
INTEL = Only half of "intelligence".

Mar 24 2018
