
digitalmars.D - Am I reading this wrong, or is std.getopt *really* this stupid?

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
I just ran into this seemingly small problem:

	import std.getopt;

	void main(string[] args) {
		string[] searchPaths;
		getopt(args,
			"l", (string opt, string arg) {
				// openFile (not shown) searches through searchPaths
				openFile(arg);
			},
			"I", (string opt, string arg) {
				searchPaths ~= arg;
			},
			// ... (further options elided)
		);
	}

Running the program with:

	program -I /path/to -l myfile

causes a runtime error that 'myfile' cannot be found, even though it
actually exists in /path/to/*.  I thought it was odd, since obviously
the -I is parsed before the -l, so searchPaths should already be set
when -l is seen, right?

Well, looking at the implementation of std.getopt turned up the
disturbing fact that the program's argument list is actually scanned
*multiple times*, one for each possible option(!).  Besides the bogonity
that whether or not searchPaths will be set prior to finding -l depends
on the order of arguments passed to getopt(), this also represents an
O(n*m) complexity in scanning program arguments, where n = number of
arguments and m = number of possible options.

And this is not to mention the fact that getoptImpl is a *recursive
template*.  Why, oh why?

Am I the only one who thinks the current implementation of getopt() is
really stupid??  Can somebody please talk some sense into me, or point
out something really obvious that I'm missing?


T

-- 
The government pretends to pay us a salary,
and we pretend to work.
Mar 23 2018
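
The behaviour described in the post above can be observed with a minimal sketch along these lines. It assumes std.getopt as discussed in this thread; the helper name `handlerOrder` is made up for illustration, and the handlers only record that they ran instead of opening files. No expected output is claimed here, since the order is exactly what the thread is arguing about.

```d
import std.getopt;
import std.stdio;

// Hypothetical helper: returns the order in which getopt invoked the handlers.
string[] handlerOrder(string[] args)
{
    string[] order;
    getopt(args,
        // "l" is declared first ...
        "l", (string opt, string arg) { order ~= "l"; },
        // ... and "I" second, regardless of where they appear in args.
        "I", (string opt, string arg) { order ~= "I"; });
    return order;
}

void main()
{
    // -I precedes -l on the command line, as in the original report.
    writeln(handlerOrder(["program", "-I", "/path/to", "-l", "myfile"]));
}
```

If the printed order starts with "l", the handlers were run in declaration order rather than command-line order.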
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.com> writes:
On 3/23/18 7:29 PM, H. S. Teoh wrote:
 Well, looking at the implementation of std.getopt turned up the
 disturbing fact that the program's argument list is actually scanned
 *multiple times*, one for each possible option(!).  Besides the bogonity
 that whether or not searchPaths will be set prior to finding -l depends
 on the order of arguments passed to getopt(), this also represents an
 O(n*m) complexity in scanning program arguments, where n = number of
 arguments and m = number of possible options.
 
 And this is not to mention the fact that getoptImpl is a *recursive
 template*.  Why, oh why?
 
 Am I the only one who thinks the current implementation of getopt() is
 really stupid??  Can somebody please talk some sense into me, or point
 out something really obvious that I'm missing?
Affirmative. The implementation is quadratic (including a removal of the option from the string). This is intentional, i.e. understood and acknowledged while I was working on it. Given that the function is only called once per run and with a number of arguments at most in the dozens, by the time its complexity becomes an issue the function is long beyond its charter.

This isn't the only instance of quadratic algorithms in Phobos. Quicksort uses an insertion sort - a quadratic algorithm - for 25 elements or fewer. That algorithm may do 600 comparisons in the worst case, and it's potentially that many for each group of 25 elements in a large array.

Spending time on improving the speed of getopt is unrecommended. Such work would add no value.

Andrei
Mar 23 2018
parent reply Chris Katko <CKATKO GMAIL.COM> writes:
Is there a possibility of improving this function?

- While quadratic, for low N, quadratic isn't a big deal. So at what point does quadratic for this function become "a problem"?

- If it is a problem, what's stopping someone from improving it?

Last question though, is there any kind of list of features, and minor features and fixes that can or need to be done? Perhaps it already exists, but it seems like it'd be great to have a wiki of contribution sites (like this function) that someone could just browse and go "Hey, I know how to do X, maybe I'll take a crack at it."

That way, devs who don't have time to improve something "low on the list" could still outsource it in a clear list, instead of relying on people who just happen to see it on the forum at the right place and the right time.
Mar 23 2018
next sibling parent reply Seb <seb wilzba.ch> writes:
Yes, Bugzilla is full of excellent ideas:

https://issues.dlang.org/buglist.cgi?component=phobos&list_id=220544&product=D&resolution=---

There are even some tags like "bootcamp" for someone who is looking to get started:

https://issues.dlang.org/buglist.cgi?component=phobos&keywords=bootcamp%2C%20preapproved&keywords_type=anywords&list_id=220545&product=D&query_format=advanced&resolution=---

We have also recently started to experiment with GitHub's new project dashboards. Currently they are tracking projects like improving the documentation, @safe-ty, DIP1000 etc.:

https://github.com/dlang/phobos/projects

DMD has a similar set which is based on Walter's recent post [1]

https://github.com/dlang/dmd/projects

Last, but not least there's a "Get involved" guide at the wiki:

https://wiki.dlang.org/Get_involved

As you couldn't find any of these pages, please let us know where you looked first, so that maybe we can make it easier for future people to find this information ;-)

[1] https://forum.dlang.org/post/p6oibo$1lmi$1@digitalmars.com
Mar 23 2018
parent rumbu <rumbu rumbu.ro> writes:
I saw this kind of "call to arms" post spreading around the forum in the last few days. Very nice to have some kind of plan, but before getting involved, I really would like to know how this can be done on Windows. And please don't redirect me to the wiki; the information there is clearly outdated and Linux oriented (at least the test stuff).

https://forum.dlang.org/post/gxxfmrnezfrlodlhpiwe@forum.dlang.org
https://forum.dlang.org/post/tulbhulbeqqzofdxevcg@forum.dlang.org

Thanks.
Mar 23 2018
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/24/2018 01:55 AM, Chris Katko wrote:
Is there a possibility of improving this function?
Most likely it can be improved in terms of features.
   - While quadratic, for low N, quadratic isn't a big deal. So at what 
 point does quadratic for this function become "a problem"?
At a point where a realistic benchmark shows a need. Without a motivating measurement, making getopt faster would be a waste of time.

I mentioned another function: sort. For that, YES, there are ways of improving it. In fact, right after posting my message, I couldn't sleep thinking of a number of ways to improve the short sort part. We have precise benchmarks measuring the number of comparisons and swaps performed by our implementation of sort. Improving its performance lifts a lot of boats - many high-level algorithms use sort as an essential building block. There's a world of difference in impact of the speed of sort vs. speed of getopt.

Here are a few ideas for improving the small array sort part (the last mile of sort):

* Currently we switch to short sort when the number of elements to sort is smaller than max(32, 1024 / Elem.sizeof). Probably a better choice can be found through experimentation.

* Insertion sort does linear search in the already-sorted portion. Probably galloping search would fare better.

* Insertion sort starts from the end and grows the sorted portion down. Starting from the middle of the array and growing left and right simultaneously would slash the number of comparisons and swaps in half.

* There could be other algorithms better for short arrays, such as specialized versions of heapsort or smoothsort.
   - If it is a problem, what's stopping someone from improving it?
Hopefully very little.
 Last question though, is there any kind of list of features, and minor 
 features and fixes that can or need to be done? Perhaps it already 
 exists, but it seems like it'd be great to have a wiki of contribution 
 sites (like this function) that someone could just browse and go "Hey, I 
 know how to do X, maybe I'll take a crack at it." That way, devs who 
 don't have time to improve something "low on the list" could still 
 outsource it in a clear list instead of people who just happen to see it 
 on the forum at the right place right time.
Seb gave a great answer - thanks! Andrei
Mar 24 2018
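
To make the second idea in the list above concrete, here is a rough sketch of an insertion sort that searches the already-sorted prefix with something better than a linear scan. It is not the Phobos implementation, and it swaps in plain binary search for the galloping search mentioned in the post because binary search is shorter to show; the function name `binaryInsertionSort` and the comparison counter are assumptions made for illustration.

```d
import std.stdio;

// Sorts `a` in place and returns the number of element comparisons made.
// Illustrative only: binary search replaces the linear scan of the sorted
// prefix; the element moves are still quadratic in the worst case.
size_t binaryInsertionSort(int[] a)
{
    size_t comparisons;
    foreach (i; 1 .. a.length)
    {
        immutable x = a[i];
        // Find the first position in a[0 .. i] whose value is greater than x.
        size_t lo = 0, hi = i;
        while (lo < hi)
        {
            immutable mid = lo + (hi - lo) / 2;
            ++comparisons;
            if (a[mid] <= x)
                lo = mid + 1; // keeps equal elements in their original order
            else
                hi = mid;
        }
        // Shift a[lo .. i] one slot to the right and drop x into the gap.
        for (size_t j = i; j > lo; --j)
            a[j] = a[j - 1];
        a[lo] = x;
    }
    return comparisons;
}

void main()
{
    // 25 elements in descending order.
    int[] data;
    foreach (v; 0 .. 25)
        data ~= 25 - v;
    immutable n = binaryInsertionSort(data);
    writefln("sorted: %s, comparisons: %s", data, n);
}
```

Each insertion into a prefix of at most 24 elements needs at most 5 comparisons, so a 25-element run stays at or below 120 comparisons. Whether that wins in practice would have to be shown by the kind of benchmark the post asks for.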
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 08:27:48AM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 At a point where a realistic benchmarks shows a need. Without a motivating
 measurement, making getopt faster would be a waste of time.
[...]

Guys, for crying out loud, my original complaint was not *performance*, but that the (strange) choice of algorithm for getopt resulted in the very counterintuitive behaviour that the order options are processed depends on the order of option declarations rather than the order they were specified on the command-line.

This makes it basically impossible to implement certain styles of option processing, such as that employed in the popular ImageMagick suite of tools, where it matters that options are processed in the order they are specified by the user, rather than some arbitrary (to a user who doesn't and shouldn't care to know the code) predetermined order.

My complaint about the quadratic algorithm was not in the fact that it's quadratic, but that it exhibited this strange (and annoying!) behaviour, especially since the saner (IMO) non-quadratic algorithm would have been the expected choice in the first place, that would *not* have had this problem. It felt almost like we went out of our way just to make things counterintuitive, with slowness added as a cherry on top.


T

-- 
"You are a very disagreeable person." "NO."
Mar 24 2018
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 03/24/2018 09:36 AM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 08:27:48AM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 At a point where a realistic benchmarks shows a need. Without a motivating
 measurement, making getopt faster would be a waste of time.
[...] Guys, for crying out loud, my original complaint was not *performance*, but that the (strange) choice of algorithm for getopt resulted in the very counterintuitive behaviour that the order options are processed depends on the order of option declarations rather than the order they were specified on the command-line.
I'd have a difficult time interpreting the following as not performance-related:
 Well, looking at the implementation of std.getopt turned up the
 disturbing fact that the program's argument list is actually scanned
 *multiple times*, one for each possible option(!).  Besides the bogonity
 that whether or not searchPaths will be set prior to finding -l depends
 on the order of arguments passed to getopt(), this also represents an
 O(n*m) complexity in scanning program arguments, where n = number of
 arguments and m = number of possible options.
Anyhow. Right now the order of processing is the same as the lexical order in which flags are passed to getopt. There may be use cases for which that's the more desirable way to go about things, so if you author a PR to change the order you'd need to build an argument on why command-line order is better. FWIW the traditional POSIX doctrine makes behavior of flags independent of their order, which would imply the current choice is more natural.
 This makes it basically impossible
 to implement certain styles of option processing, such as that employed
 in the popular ImageMagick suite of tools, where it matters that options
 are processed in the order they are specified by the user, rather than
 some arbitrary (to a user who doesn't and shouldn't care to know the
 code) predetermined order.
This is an exaggeration. Yes you can't process with lambdas. You can always collect options first, process after. This is a well-supported use case.
 My complaint about the quadratic algorithm was not in the fact that it's
 quadratic, but that it exhibited this strange (and annoying!) behaviour,
 especially since the saner (IMO) non-quadratic algorithm would have been
 the expected choice in the first place, that would *not* have had this
 problem. It felt almost like we went out of our way just to make things
 counterintuitive, with slowness added as a cherry on top.
I want to be convinced. I think you'd need to build a better case on why you consider one behavior intuitive and the other counterintuitive. Andrei
Mar 24 2018
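
The "collect options first, process after" approach mentioned in the post above might look roughly like this. The sketch binds each option to an array, which std.getopt appends to on every occurrence; the `Options` struct and the `parse` helper are names invented for illustration. Note the limitation: the relative order between different options (a `-I` given after a `-l`, say) is not recorded this way, which is the ImageMagick-style case raised earlier in the thread.

```d
import std.getopt;
import std.stdio;

// Hypothetical container for everything collected from the command line.
struct Options
{
    string[] searchPaths; // every value given with -I
    string[] files;       // every value given with -l
}

// Phase 1: only collect. No handler touches state owned by another option.
Options parse(string[] args)
{
    Options o;
    getopt(args,
        "I", &o.searchPaths,
        "l", &o.files);
    return o;
}

void main(string[] args)
{
    auto o = parse(args);
    // Phase 2: act once all search paths are known.
    foreach (f; o.files)
        writefln("would look for %s in %s", f, o.searchPaths);
}
```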
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
[...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.
So what about making this configurable? And documented? Last time I checked, this was not clearly stated in the docs. [...]
 My complaint about the quadratic algorithm was not in the fact that
 it's quadratic, but that it exhibited this strange (and annoying!)
 behaviour, especially since the saner (IMO) non-quadratic algorithm
 would have been the expected choice in the first place, that would
 *not* have had this problem. It felt almost like we went out of our
 way just to make things counterintuitive, with slowness added as a
 cherry on top.
I want to be convinced. I think you'd need to build a better case on why you consider one behavior intuitive and the other counterintuitive.
[...]

Honestly, I've wasted far too much time writing about this on the forum already. In the time it took to argue about this, I could have already written my own version of getopt that does what I want, instead of fighting with strange design decisions in Phobos. I'm not going to waste any more time arguing about this, since, after all, it *is* "just" getopt().

This was not the only issue I struggled with, as std.getopt has other design differences incompatible with Posix getopt() that make it hard to support the original semantics of a previous C++ project ported to D. Yes, I could have used Posix getopt() from D, but that requires some ugly shim code, tons of toStringz/fromStringz, doesn't take advantage of things like automatic enum conversions, etc., which sux given that we're in D, not C++.

And given the defensiveness surrounding std.getopt, my conclusion can only be: dump std.getopt, roll my own. It's sad, since in general Phobos design tends to be superior to its C++ counterparts. But we then have warts like std.getopt that people refuse to acknowledge is a problem. So be it.


T

-- 
In a world without fences, who needs Windows and Gates? -- Christian Surchi
Mar 24 2018
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.
So what about making this configurable?
That'd be great. I'm thinking something like an option std.getopt.config.commandLineOrder. Must be first option specified right after arguments. Sounds good?
Mar 24 2018
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 05:24:28PM -0400, Andrei Alexandrescu via Digitalmars-d
wrote:
 On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the
 lexical order in which flags are passed to getopt. There may be
 use cases for which that's the more desirable way to go about
 things, so if you author a PR to change the order you'd need to
 build an argument on why command-line order is better. FWIW the
 traditional POSIX doctrine makes behavior of flags independent of
 their order, which would imply the current choice is more natural.
So what about making this configurable?
That'd be great. I'm thinking something like an option std.getopt.config.commandLineOrder. Must be first option specified right after arguments. Sounds good?
Great! Not so sure how easy it is to implement while supporting everything else, though, given the current structure of the code.


T

-- 
Time flies like an arrow. Fruit flies like a banana.
Mar 24 2018
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Sat, 24 Mar 2018 17:24:28 -0400 schrieb Andrei Alexandrescu:

 On 3/24/18 12:59 PM, H. S. Teoh wrote:
 On Sat, Mar 24, 2018 at 12:11:18PM -0400, Andrei Alexandrescu via
 Digitalmars-d wrote:
 [...]
 Anyhow. Right now the order of processing is the same as the lexical
 order in which flags are passed to getopt. There may be use cases for
 which that's the more desirable way to go about things, so if you
 author a PR to change the order you'd need to build an argument on why
 command-line order is better. FWIW the traditional POSIX doctrine
 makes behavior of flags independent of their order, which would imply
 the current choice is more natural.
So what about making this configurable?
That'd be great. I'm thinking something like an option std.getopt.config.commandLineOrder. Must be first option specified right after arguments. Sounds good?
I don't really understand why you want to keep this lexical order functionality. There's a well defined use case for command line order: Allowing users to write commands in a natural, left-to-right style, where options on the right are more specific: systemctl status -l ...

I've never heard of any use case where the lexical order of the arguments passed to getopt matters for parsing user supplied command arguments. Is there any use case for this?

I thought the only reason we have this lexical order parsing is because it's simpler to implement. But if we'll get the non-quadratic command-line order implementation there's no reason to keep and maintain the quadratic implementation.

-- Johannes
Mar 25 2018
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/25/18 10:22 AM, Johannes Pfau wrote:
 I don't really understand why you want to keep this lexical order
 functionality.
I don't want to. I think others will, once their programs that depend on the current semantics run into trouble.
Mar 25 2018
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Sunday, 25 March 2018 at 14:25:49 UTC, Andrei Alexandrescu 
wrote:
 I don't want. I think others will, once their programs 
 depending on the current semantics will have trouble.
The current semantics are not documented, so any program that relies on them is foolish anyway. Like I said in my code, I read options and run them in separate orders since I specifically want control - I imagine most everyone else does too, since otherwise you are depending on underspecified behavior and liable to break without notice.
Mar 26 2018
prev sibling parent reply Abdulhaq <alynch4047 gmail.com> writes:
On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu 
wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option 
 specified right after arguments. Sounds good?
I thought this was a clever joke, but everyone is taking it seriously ?! "When running mygreatprog.exe, always run with --command-line-order CommandLine as the first command line option, otherwise mygreatprog.exe may misinterpret the command line"
Mar 25 2018
parent reply Abdulhaq <alynch4047 gmail.com> writes:
On Sunday, 25 March 2018 at 14:46:23 UTC, Abdulhaq wrote:
 On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu 
 wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option 
 specified right after arguments. Sounds good?
I thought this was a clever joke, but everyone is taking it seriously ?! "When running mygreatprog.exe, always run with --command-line-order CommandLine as the first command line option, otherwise mygreatprog.exe may misinterpret the command line"
Oops sorry to reply to myself, I realise my mistake now :-)
Mar 25 2018
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/25/18 10:48 AM, Abdulhaq wrote:
 On Sunday, 25 March 2018 at 14:46:23 UTC, Abdulhaq wrote:
 On Saturday, 24 March 2018 at 21:24:28 UTC, Andrei Alexandrescu wrote:
 That'd be great. I'm thinking something like an option 
 std.getopt.config.commandLineOrder. Must be first option specified 
 right after arguments. Sounds good?
I thought this was a clever joke, but everyone is taking it seriously ?! "When running mygreatprog.exe, always run with --command-line-order CommandLine as the first command line option, otherwise mygreatprog.exe may misinterpret the command line"
Oops sorry to reply to myself, I realise my mistake now :-)
To purge thy mistake: implement :o).
Mar 25 2018
prev sibling next sibling parent Jon Degenhardt <jond noreply.com> writes:
On Saturday, 24 March 2018 at 16:11:18 UTC, Andrei Alexandrescu 
wrote:
 Anyhow. Right now the order of processing is the same as the 
 lexical order in which flags are passed to getopt. There may be 
 use cases for which that's the more desirable way to go about 
 things, so if you author a PR to change the order you'd need to 
 build an argument on why command-line order is better. FWIW the 
 traditional POSIX doctrine makes behavior of flags independent 
 of their order, which would imply the current choice is more 
 natural.
Several of the TSV tools I built rely on command-line order. There is an enhancement request here: https://issues.dlang.org/show_bug.cgi?id=16539.

A few of the tools use a paradigm where the user is entering a series of instructions on the command line, and there are times when the user-entered order matters. Two general cases:

* Display/output order - The tool produces delimited output, and the user wants to control the order. The order of command line options determines the order.

* Short-circuiting - tsv-filter in particular allows numeric tests like less-than, but also allows the user to short-circuit the test by testing if the data contains a valid number prior to making the numeric test. This is done by evaluating the command line arguments in left-to-right order. Short-circuiting is supported by the Unix `find` utility.

I have used this approach for CLI tools I've written in other languages. Perl's Getopt::Long processes args in command-line order, so it supports this.

I considered submitting a PR to getopt to change this, but decided against it. The approach used looks like it is central to the design, and changing it in a backward compatible way would be a meaningful undertaking. Instead I wrote a cover to getopt that processes arguments in command-line order. It is here: https://github.com/eBay/tsv-utils-dlang/blob/master/common/src/getopt_inorder.d. It handles most of what std.getopt handles.

The TSV utilities documentation should help illustrate these cases. tsv-filter uses short-circuiting: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-filter-reference. Look for "Short-circuiting expressions" toward the bottom of the section. tsv-summarize obeys the command-line order for output/display. See: https://github.com/eBay/tsv-utils-dlang/blob/master/docs/ToolReference.md#tsv-summarize-reference.

There's one other general limitation I encountered with the current compile-time approach to command-line argument processing. I couldn't find a clean way to allow it to be extended in a plug-in manner. In particular, the original goal for the tsv-summarize tool was to allow users to create custom operators. The tool has a fair number of built-in operators, like median, sum, min, max, etc. Each of these operators has a getopt arg invoking it, e.g. '--median', '--sum', etc. However, it is common for people to have custom analysis needs, so allowing extension of the set would be quite useful.

The code is set up to allow this. People would clone the repo, write their own operator, placed in a separate file they maintain, and rebuild. However, I couldn't figure out a clean way to allow additions to the command line argument set. There may be a reasonable way and I just couldn't find it, but my current thinking is that I need to write my own command line argument handler to support this idea. I think handling command line argument processing at run-time would make this simpler, at the cost of losing some compile-time validation.

--Jon
Mar 24 2018
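
As a rough illustration of the run-time, command-line-order approach described in the post above, the following sketch walks the arguments left to right and looks each one up in a handler table that can be extended at run time. This is not the tsv-utils `getopt_inorder` code and not std.getopt; `Handler` and `processInOrder` are hypothetical names, and every registered option is assumed to take exactly one value.

```d
import std.stdio;

// Hypothetical handler type: receives the value that follows the option.
alias Handler = void delegate(string value);

// Walks args left to right and calls the handler registered for each option,
// so handlers fire in the order the user typed them.
// Returns the arguments that were not consumed as options or option values.
string[] processInOrder(string[] args, Handler[string] handlers)
{
    string[] rest;
    for (size_t i = 1; i < args.length; ++i) // args[0] is the program name
    {
        if (auto h = args[i] in handlers)
        {
            if (i + 1 >= args.length)
                throw new Exception("missing value for " ~ args[i]);
            (*h)(args[i + 1]);
            ++i; // skip the value just consumed
        }
        else
            rest ~= args[i];
    }
    return rest;
}

void main(string[] args)
{
    string[] searchPaths;
    Handler[string] handlers;
    // The table is ordinary data, so further entries could be added at run time.
    handlers["-I"] = (string v) { searchPaths ~= v; };
    handlers["-l"] = (string v) { writefln("open %s using %s", v, searchPaths); };
    processInOrder(args, handlers);
}
```

The trade-off is the one named in the post: the table is checked only at run time, so typos in option names or mismatched value types are no longer caught by the compiler.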
prev sibling next sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Saturday, March 24, 2018 09:59:44 H. S. Teoh via Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my conclusion can
 only be: dump std.getopt, roll my own.  It's sad, since in general
 Phobos design tends to be superior to its C++ counterparts.  But we then
 have warts like std.getopt that people refuse to acknowledge is a
 problem.  So be it.
I think that there are at least a couple of alternatives to std.getopt on code.dlang.org if you want them. Personally, the only complaints I've had with std.getopt are that bundling isn't the default and that it's not always easy to figure out whether an argument has been set or not. But at least the bundling can be configured, and getopt can probably be improved to work with Nullable so that it'll be easier to figure out whether an argument has been set.

As for defensiveness, I'm not quite sure what you're referring to. The main point was that given how often getopt gets called in a program, improving its Big-O complexity isn't worth it, but there have been a number of improvements to getopt over the years, so it's not like we're not allowed to improve it. It's just that improving its Big-O complexity is kind of pointless.

In any case, as Andrei said, if a new option can be added to fix your use case, then that shouldn't be a problem, though I have no clue how much of a pain that will be to implement, particularly since std.getopt isn't exactly simple.

- Jonathan M Davis
Mar 24 2018
parent reply Seb <seb wilzba.ch> writes:
On Sunday, 25 March 2018 at 04:30:31 UTC, Jonathan M Davis wrote:
 On Saturday, March 24, 2018 09:59:44 H. S. Teoh via 
 Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my 
 conclusion can only be: dump std.getopt, roll my own.  It's 
 sad, since in general Phobos design tends to be superior to
Yeah I have the "dump XYZ, roll my own" experience often too. As there are already many big libraries like `arsd` or `ae` out there, I don't think I'm the only one with this feeling. I wonder if someone will ever try to fork/reboot Phobos with all the goodies, but without the legacy cruft like auto-decoding and similar friends that can't be fixed without breaking changes.
 its C++ counterparts.  But we then have warts like std.getopt 
 that people refuse to acknowledge is a problem.  So be it.
I think that there are at least a couple alternatives to std.getopt on code.dlang.org if you want alternatives.
Yes, two good ones are: https://blog.thecybershadow.net/2014/08/05/ae-utils-funopt http://code.dlang.org/packages/darg
 Personally, the only complaints I've had with std.getopt is
Hehe I like many things about std.getopt, but it's not perfect either. A few examples:

- I often just want to map CLI arguments to a config object, where using UDAs would be more natural and less boilerplate

	struct Config
	{
		@option("c|compiler")
		string compiler;
	}

Now with the rejected/postponed __traits(documentation) the ddoc help text could be automatically read and put into the auto-generated CLI help.

- I don't like to manually check for .helpWanted

Imho I constantly find myself doing this:

	if (helpInformation.helpWanted)
	{
		`DDoc wrapper
	All unknown options are passed to the compiler.
	./ddoc <file>...
	`.defaultGetoptPrinter(helpInformation.options);
		return 1;
	}

I would have preferred this being the default behavior or at least the default behavior if a help text string is explicitly provided e.g. like:

	getopt(`My program
	./program ...`, args, ...);

or maybe with sth. like `.withHelp("")`

- setting shared configs doesn't work

I know, I could use a TLS config or use an atomicOp or synchronized assignment to set it, but often casting is easier and that's rather ugly: https://github.com/dlang/dlang.org/blob/master/dspec_tester.d#L101

- in theory getopt should be @safe and use ref

It just does a bit of string manipulation, but it looks like we have to wait until DIP1000 for this: https://github.com/dlang/phobos/pull/6281

Also, similarly to std.stdio.readf or std.format.formattedRead, there's no need to use pointers; D's @safe ref would have worked too. Now, it looks like this change can't be made anymore as it would be a breaking one due to ambiguities.

- it would be really cool to support generating zsh/bash completion files

This is the last point on my list as it's not really a limitation of std.getopt and GetoptResult should be enough for this, but it looks like no one bothered enough to write a zshGetoptPrinter so far.
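For the .helpWanted point, a thin wrapper along these lines might be enough to remove the repeated check. This is only a sketch: `getoptWithHelp` is a hypothetical helper, not something Phobos provides, and the help text and option are made up for the example. It forwards everything to `getopt` and calls `defaultGetoptPrinter` itself.

```d
import std.getopt;

/// Hypothetical convenience wrapper (not in Phobos): runs getopt and, if
/// -h/--help was given, prints helpText followed by the option table.
/// Returns true when help was printed, so the caller can exit early.
bool getoptWithHelp(T...)(string helpText, ref string[] args, T opts)
{
    auto result = getopt(args, opts);
    if (result.helpWanted)
    {
        defaultGetoptPrinter(helpText, result.options);
        return true;
    }
    return false;
}

int main(string[] args)
{
    string compiler = "dmd";
    if (getoptWithHelp("DDoc wrapper\nUsage: ./ddoc <file>...", args,
            "c|compiler", "Compiler to use", &compiler))
        return 1;
    return 0;
}
```

Whether printing help should count as a failure exit is a design choice; the sketch returns 1 only to mirror the snippet above.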
 getopt can probably be improved to work with Nullable so that 
 it'll be easier to figure out whether an argument has been set.
Yes, supporting Nullable would be really cool!
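Until something like that exists, a callback receiver can stand in for it. The sketch below assumes a made-up `--jobs` option and a hypothetical `parseJobs` helper; the only getopt feature it relies on is the documented `void delegate(string option, string value)` callback form.

```d
import std.conv : to;
import std.getopt;
import std.typecons : Nullable;

/// Parses --jobs into a Nullable so that "not given" can be told apart
/// from every possible value. parseJobs is a hypothetical helper.
Nullable!int parseJobs(string[] args)
{
    Nullable!int jobs;
    getopt(args,
        "j|jobs", (string opt, string value) { jobs = value.to!int; });
    return jobs;
}

void main()
{
    assert(parseJobs(["prog"]).isNull);
    assert(parseJobs(["prog", "--jobs", "4"]).get == 4);
}
```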
Mar 24 2018
next sibling parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Sunday, 25 March 2018 at 06:58:50 UTC, Seb wrote:
 I think that there are at least a couple alternatives to 
 std.getopt on code.dlang.org if you want alternatives.
Yes, two good ones are: https://blog.thecybershadow.net/2014/08/05/ae-utils-funopt
funopt is based on getopt underneath, so this issue still applies to it, sorry! Well, funopt translates options to function arguments, so there's no way to specify a delegate anyway, but at least the performance aspect applies.
Mar 25 2018
prev sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Sunday, 25 March 2018 at 06:58:50 UTC, Seb wrote:
 Yeah I have the "dump XYZ, roll my own" experience often too.
 As there are already many big libraries like `arsd` or `ae` out 
 there, I don't think I'm the only one with this feeling.
In my case, there's very little overlap with what Phobos offers. And in cases where there is, it is usually either 1) built on top of phobos (e.g. my jsvar actually depends on std.json), and/or 2) older than the Phobos offering, often significantly so.

Well, there's also a few explicitly redone functions like `to` hidden inside my color.d, but that's a dependency bloat thing and less relevant now that phobos is getting its own messy import web under control (this came when importing the most trivial phobos module would triple the build time and double the executable size of all my gui apps, so the gui module tree in my libs were all phobos free, and that did mean a few trivial reimplementations but to!string(int) is like a five liner sooo easy trade there. But now the situation is much better.)

But anyway, what Phobos does, it tends to do reasonably well in my view (with a couple glaring exceptions). I kinda like std.getopt. It isn't perfect and I could do better... but it is there and it is good enough, so I defend it.

Phobos just doesn't even attempt most of what I need, so I also have a LOT of reusable add-on code that I call the arsd repo which is kinda part two of my personal D standard library :) .
Mar 26 2018
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 10:30:31PM -0600, Jonathan M Davis via Digitalmars-d
wrote:
 On Saturday, March 24, 2018 09:59:44 H. S. Teoh via Digitalmars-d wrote:
 And given the defensiveness surrounding std.getopt, my conclusion
 can only be: dump std.getopt, roll my own.  It's sad, since in
 general Phobos design tends to be superior to its C++ counterparts.
 But we then have warts like std.getopt that people refuse to
 acknowledge is a problem.  So be it.
[...]
 As for defensiveness, I'm not quite sure what you're referring to. The
 main point was that given how often getopt gets called in a program,
 improving its Big-O complexity isn't worth it, but there have been a
 number of improvements to getopt over the years, so it's not like
 we're not allowed to improve it. It's just that improving its Big-O
 complexity is kind of pointless. In any case, as Andrei said, if a new
 option can be added to fix your use case, then that shouldn't be a
 problem, though I have no clue how much of a pain that will be to
 implement, particularly since std.getopt isn't exactly simple.
[...]

OK, the part about defensiveness may be just my overreaction. I apologize. But yeah, I glanced at the code, and don't see any easy way to implement what Andrei agreed with. It's just too much work for something I could just write for myself in a much shorter time. I guess I'll just log an enhancement request in bugzilla and leave it at that.


T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
Mar 24 2018
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 10:05:36PM -0700, H. S. Teoh via Digitalmars-d wrote:
[...]
 OK, the part about defensiveness may be just my overreaction. I
 apologize.  But yeah, I glanced at the code, and don't see any easy
 way to implement what Andrei agreed with. It's just too much work for
 something I could just write for myself in a much shorter time. I
 guess I'll just log an enhancement request in bugzilla and leave it at
 that.
[...]

Turns out, there's already been an issue for this, filed 2 years ago:

	https://issues.dlang.org/show_bug.cgi?id=16539


T

-- 
People say I'm arrogant, and I'm proud of it.
Mar 25 2018
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, and minor
 features and fixes that can or need to be done? Perhaps it already exists,
And here it is: https://issues.dlang.org/
Mar 25 2018
parent reply Rubn <where is.this> writes:
On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, 
 and minor features and fixes that can or need to be done? 
 Perhaps it already exists,
And here it is: https://issues.dlang.org/
Not a very comprehensive list. Virtually all of those issues have no comment on them. If it's a feature request you might as well assume it requires a DIP, because otherwise there's no reason to waste your time implementing the feature.

Oddly enough there's almost no way to get anyone's attention on whether a feature requires a DIP or not unless there's a pull request for the feature. So if you want a feature you almost have to risk wasting your time implementing it.

It's not a very good system, but someone throws up some stats about how many issues get solved/pull requests get created per month and they conclude that it's working fine.
Mar 25 2018
next sibling parent Seb <seb wilzba.ch> writes:
On Sunday, 25 March 2018 at 13:23:04 UTC, Rubn wrote:
 On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, 
 and minor features and fixes that can or need to be done? 
 Perhaps it already exists,
And here it is: https://issues.dlang.org/
Not a very comprehensive list. Virtually all of those issues have no comment on them. If it's a feature request you might as assume it requires a DIP cause there's no reason to otherwise waste your time implementing the feature.
Well, first off - most of these issues are bug reports and would obviously be pulled if fixed. Also, bigger improvements to Phobos don't require a DIP - just Andrei's approval. There have been many discussions on this (a recent one: https://forum.dlang.org/post/mailman.1298.1521583794.3374.digitalma s-d puremagic.com), but in short it's going to stay like this, and you can easily shoot Andrei a mail _before_ doing something bigger in Phobos.

Now regarding language features, the DIP process has been revamped: https://forum.dlang.org/post/p95hjs$1nf$1 digitalmars.com
 Oddly enough there's almost no other way to get anyone's 
 attention of whether a feature requires a DIP or not unless 
 there's a pull request for the feature. So if you want a 
 feature you almost have to risk wasting your time implementing 
 it.
Improvements to the language will require a DIP. Sometimes (like e.g. for a new trait) it's possible to get direct approval from Walter/Andrei on GitHub, but the rule of thumb is that it needs a DIP. When in doubt, you can discuss a new feature on Slack, IRC or this NG here.
 It's not a very good system, but someone throws up some stats 
 about how many issues get solved/pull requests get created per 
 month and they conclude that it's working fine.
You aren't alone:

https://github.com/wilzbach/state-of-d-2018/blob/master/09c:%20What%2C%20if%20anything%2C%20do%20you%20dislike%20about%20D's%20issue%20process%3F
https://github.com/wilzbach/state-of-d-2018/blob/master/09d:%20What%2C%20if%20anything%2C%20has%20prevented%20you%20from%20opening%20an%20issue%3F

There are many improvements hopefully coming up to issues.dlang.org in the near future: https://forum.dlang.org/post/tneyowfjewrlrtnqsuvd forum.dlang.org

If you have more specific ideas on what could be done to improve issues.dlang.org, share them on #dlang_org in Slack, here or in Bugzilla.

Note:
- Switching to GH issues is a common request, and while I also fought for that one (I even tested an automatic migration to GH), there are quite a few downsides to the GH issue tracker and at the moment the consensus is to give Mozilla's fork of Bugzilla a fair try
- "there are almost no comments" on issues isn't actionable, but a system/idea to improve this would be.
Mar 25 2018
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/25/2018 6:23 AM, Rubn wrote:
 On Sunday, 25 March 2018 at 09:27:31 UTC, Walter Bright wrote:
 On 3/23/2018 10:55 PM, Chris Katko wrote:
 Last question though, is there any kind of list of features, and minor 
 features and fixes that can or need to be done? Perhaps it already exists,
And here it is: https://issues.dlang.org/
Virtually all of those issues have no comment on them.
Hence there's plenty of "need to be done" contributions to make! People often make very valuable contributions to issues in the comments by:

1. producing a reduced test case (the smaller the test case, the easier it is to track down)
2. finding the cause of the bug
3. finding the pull request that introduced the bug
4. connecting to related work
5. any other information helpful in resolving it
Mar 25 2018
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 23 March 2018 at 23:29:48 UTC, H. S. Teoh wrote:
 I just ran into this seemingly small problem:
The way I'd do this is to only use getopt to build the lists, then actually process them externally. (lol adding another loop)

	string[] searchPaths;
	string[] files;
	getopt(args,
		"l", &files,
		"I", &searchPaths
	);

	foreach(file; files)
		openFile(file);

then it is clear what order your operations are done in anyway, and you have a chance to perhaps report bad syntax before actually doing any real work. Wouldn't it be weird for example if

	$ cat foo.d --help

spat out the contents followed by the help?
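Spelled out as a complete program, the two-phase approach might look like the sketch below. The `resolve` function is a hypothetical stand-in for the search that `openFile` does in the original post; its lookup order (search paths first, then the bare name) is an assumption.

```d
import std.file : exists;
import std.getopt;
import std.path : buildPath;
import std.stdio : writeln;

/// Hypothetical lookup: returns the first existing path for name, or null.
string resolve(string name, const string[] searchPaths)
{
    foreach (dir; searchPaths)
    {
        auto candidate = buildPath(dir, name);
        if (candidate.exists)
            return candidate;
    }
    return name.exists ? name : null;
}

void main(string[] args)
{
    string[] searchPaths;
    string[] files;

    // Phase 1: getopt only collects values; array receivers append.
    getopt(args,
        "I", &searchPaths,
        "l", &files);

    // Phase 2: every -I is known by now, whatever its position.
    foreach (file; files)
    {
        auto path = resolve(file, searchPaths);
        writeln(file, " -> ", path is null ? "(not found)" : path);
    }
}
```

With this split, `program -l myfile -I /path/to` and `program -I /path/to -l myfile` behave the same, because nothing is looked up until all options have been read.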
Mar 24 2018
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Mar 24, 2018 at 01:43:10PM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Friday, 23 March 2018 at 23:29:48 UTC, H. S. Teoh wrote:
 I just ran into this seemingly small problem:
 The way I'd do this is to only use getopt to build the lists, then
 actually process them externally. (lol adding another loop)
 
 	string[] searchPaths;
 	string[] files;
 	getopt(args,
 		"l", &files,
 		"I", &searchPaths
 	);
 
 	foreach(file; files)
 		openFile(file);
 
 then it is clear what order your operations are done in anyway, and
 you have a chance to perhaps report bad syntax before actually doing
 any real work. Wouldn't it be weird for example if
 
 	$ cat foo.d --help
 
 spat out the contents followed by the help?

Touche. This uglifies the code a bit, but meh. It's just main(), no biggie.


T

-- 
INTEL = Only half of "intelligence".
Mar 24 2018