digitalmars.D - Next in Review Queue: The New std.path

dsimcha (31/31) Jul 14 2011 Lars Kyllingstad's new and improved std.path module for Phobos is the

dsimcha (20/51) Jul 14 2011 ...And now I'll do the honors and be the first to review. Overall this

Jonathan M Davis (12/17) Jul 14 2011 A documentation note for it can and should be added to each function say...
Nick Sabalausky (4/16) Jul 15 2011 Isn't that what -d is for?

Jonathan M Davis (31/41) Jul 15 2011 The way that deprecation is supposed to work is in three stages:

Lars T. Kyllingstad (20/40) Jul 15 2011 I agree. (I didn't really want to keep them, but I wasn't sure whether

Nick Sabalausky (16/24) Jul 15 2011 I totally understand that reasoning, but as I pointed out in the OP of t...

Lars T. Kyllingstad (5/33) Jul 15 2011 I didn't notice your CTFE-ability thread until now, because I was away o...

Andrej Mitrovic (2/2) Jul 14 2011 Lars, is normalize basically that feature request (toNativePath) I

Lars T. Kyllingstad (13/15) Jul 15 2011 Oops, sorry. Because your feature request came in on a different channe...

Andrej Mitrovic (8/8) Jul 15 2011 The thing although Windows works with forward slashes some tools might
Andrej Mitrovic (1/1) Jul 15 2011 The thing is*

Andrej Mitrovic (2/2) Jul 14 2011 All functions have nice names except fcmp, why the abbreviation?

Nick Sabalausky (8/11) Jul 15 2011 The docs say that it does nothing on Windows: "For Windows, expandTilde(...

Mafi (5/17) Jul 15 2011 What about longname becoming longna~1 on dos/emulators etc. Expanding

Lars T. Kyllingstad (5/9) Jul 15 2011 That too, requires information about the state of the filesystem. For

Lars T. Kyllingstad (9/12) Jul 15 2011 For consistency with std.algorithm.cmp and std.string.icmp, on which it

Nick Sabalausky (15/26) Jul 15 2011 I'm happy with it having the obscure name "fcmp" because *all* it is is

Steven Schveighoffer (9/46) Jul 15 2011 The typical thing is to use the environment variable USERPROFILE, but

Nick Sabalausky (14/28) Jul 15 2011 No, like I've said, all the possible choices are frequently wrong, and t...

Steven Schveighoffer (21/59) Jul 18 2011 Certainly not having that function on Windows will encourage them to

David Nadlinger (6/11) Jul 18 2011 Just a quick idea: Maybe we could provide a set of standard directories
torhu (11/30) Jul 18 2011 Wouldn't that tend to cause the same issue? They use tilde on Linux,

Steven Schveighoffer (19/61) Jul 18 2011 So without expansion they get some local directory named ~ created (~ is...

Andrej Mitrovic (3/6) Jul 15 2011 %UserProfile%
Andrej Mitrovic (3/3) Jul 15 2011 Also, you can change the location of your home directory, although

bearophile (21/21) Jul 14 2011 Just few comments to the source code. I have not used it yet.

Lars T. Kyllingstad (17/41) Jul 15 2011 I find ssize_t (which is standard on POSIX) to be much more descriptive

KennyTM~ (3/10) Jul 15 2011 Given that immutable( const(char) ) == immutable(char), I think the

Jonathan M Davis (5/18) Jul 15 2011 I'd still put the Unqual in there. Perhaps it's due to compiler bugs, bu...

KennyTM~ (14/32) Jul 15 2011 OK. But I think you should file the compiler bug :) From what I see,

Lars T. Kyllingstad (6/43) Jul 15 2011 True, but this doesn't apply to the present case. Since the parameters

KennyTM~ (18/61) Jul 15 2011 Even if `typeof(path)` becomes `const(string)`, C1 is still an

bearophile (5/11) Jul 14 2011 Adding pure and nothrow to functions in other Phobos modules is a good t...

KennyTM~ (31/42) Jul 15 2011 The 'enforce' problem is simple. It could be easier replaced by the

KennyTM~ (19/19) Jul 15 2011 IANAL, but there should be mention of the Boost license somewhere.

Nick Sabalausky (4/8) Jul 15 2011 Fun!
Jacob Carlborg (4/23) Jul 15 2011 It's possible to do that on Mac OS X as well.
Lars T. Kyllingstad (14/35) Jul 15 2011 Mainly because I wasn't aware of its existence. ;) I'll fix.

Nick Sabalausky (39/41) Jul 15 2011 Yay!

Lars T. Kyllingstad (15/69) Jul 15 2011 Thanks! :)

Nick Sabalausky (13/25) Jul 15 2011 Oh, yea, I didn't mean to imply that there shouldn't be a resolveSymLink...

Lars T. Kyllingstad (3/37) Jul 15 2011 Fair enough. I'll think of a better name and fix it.

Nick Sabalausky (4/4) Jul 15 2011 Since the details of many of these functions have changed, is it really ...

Jonathan M Davis (5/9) Jul 15 2011 If the behavior has changed, then the old functions need to be there.

Lars T. Kyllingstad (3/13) Jul 15 2011 I'll put them back in.

Vladimir Panteleev (10/12) Jul 15 2011 Three notes:

Andrej Mitrovic (2/4) Jul 15 2011 +1.
Lars T. Kyllingstad (7/18) Jul 15 2011 1. I'll fix.

Vladimir Panteleev (16/19) Jul 15 2011 The shortest version is most useful. The idea is to do this:

Lars T. Kyllingstad (4/14) Jul 15 2011 Thinking some more about it, this algorithm should be easy to implement

Nick Sabalausky (10/19) Jul 15 2011 I don't know if you're talking about actual unittests or documentation, ...

Lars T. Kyllingstad (8/33) Jul 15 2011 So what you guys are saying is, basically, that it matters whether a

Vladimir Panteleev (8/27) Jul 15 2011 My main argument for this is that this is the *documented* behavior of

Jonathan M Davis (97/97) Jul 15 2011 Names

Nick Sabalausky (22/32) Jul 15 2011 I really, really, REALLY, *REALLY* think it's a big mistake to categoric...

Jonathan M Davis (11/48) Jul 15 2011 Well, like I said, I'm not sure that naming it joinPath instead of join ...
Andrej Mitrovic (4/4) Jul 16 2011 I was quite fond of it being renamed to joinPath, because when I'm

Lars T. Kyllingstad (40/149) Jul 16 2011 This has been discussed before, and even voted upon. While the voting

Jonathan M Davis (51/202) Jul 16 2011 Well, looking at that, it looks like it was pretty much a toss up on mos...

Lars T. Kyllingstad (6/18) Jul 17 2011 That function (expandTilde) and its documentation is from the current

Andrej Mitrovic (2/2) Jul 15 2011 I'd rather have `extension` renamed to getExt. `ext` is used way too
Jonathan M Davis (13/15) Jul 15 2011 extension is a bit iffy, since it's just the one word. So, I'm not as co...
torhu (17/26) Jul 17 2011 Looks nice and clean, both docs and code! I like modules that are not

Lars T. Kyllingstad (14/46) Jul 18 2011 Thanks!

Steven Schveighoffer (14/41) Jul 18 2011 No, it has to duplicate. Part of idup is saying "I want an immutable

Lars T. Kyllingstad (8/60) Jul 18 2011 The specific cases in question are templated functions, where I don't

Daniel Murphy (3/9) Jul 18 2011 Doesn't std.conv.to do this for you?

Lars T. Kyllingstad (4/17) Jul 18 2011 It probably does! I hadn't thought about that, but I'll check it out

Jonathan M Davis (6/24) Jul 18 2011 It does. That's why you should generally use std.conv.to instead of dup ...

torhu (10/27) Jul 18 2011 Sorry, don't know why I thought of COW in this case. Don't heap

Lars T. Kyllingstad (6/11) Jul 20 2011 See my answer to Steve. I'm considering expanding joinPath's charter a

Ellery Newcomer (5/5) Jul 18 2011 Looks nice.

Lars T. Kyllingstad (3/9) Jul 20 2011 I'll look into it. :)

dsimcha <dsimcha yahoo.com> writes:

Lars Kyllingstad's new and improved std.path module for Phobos is the 
next item up in the review queue.  I've volunteered to be the review 
manager.  Barring any objections, the review starts now and ends at the 
ends two weeks from now on July 28.  This will be followed by a week of 
voting, ending August 4th.

Code:
https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

Docs:
http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

Author's comments -- please read before reviewing:

First of all, note that I have rewritten almost everything from scratch.
Mainly, this is because there are a lot of corner cases which aren't
handled correctly by the current std.path, and in the end rewriting
turned out to be easier than updating.  (Check out the unittests; they
cover all the cases I could think of.)  The only functions I have kept
unchanged from the current std.path are pathCharMatch (formerly
fncharmatch), glob (formerly fnmatch) and expandTilde.

Secondly, I hope you will find the new function names to be both more
descriptive, more consistent and more in line with current Phobos
conventions.  They are the result of a (rather informal) round of
discussion and voting in this forum a few months ago (which I hope you
will consider before suggesting new names):

http://www.digitalmars.com/d/archives/digitalmars/D/Proposal_for_std.path_replacement_131121.html

Thirdly, I have applied  safe, pure and nothrow wherever possible.
There were cases where this was not possible, either due to bugs in DMD
(specifically, 5304, 5700 and 5798) or due to functions in other Phobos
modules not being properly marked with these attributes.  I have marked
these with //TODO comments and will fix them as soon as possible.

And finally, due to the deprecated: block at the bottom of the module,
this version should be backwards compatible with the current std.path
for a while yet.

Jul 14 2011

dsimcha <dsimcha yahoo.com> writes:

...And now I'll do the honors and be the first to review. Overall this
looks solid and well documented but I have a few nitpicks and questions

1. Do we really need currentDirSymbol and parentDirSymbol? AFAIK
they're standard across all operating systems that anyone cares about
and everyone can easily remember them. I have never used the
equivalents in the old std.path and have never missed them.

2. There's really no need to always allocate a new array on things like
setExtension. You could just append the extension in the case where the
extension doesn't already exist and the string is immutable. I'm not
sure this is worth the extra complexity, though, given that setExtension
will probably never be used in perf-critical code.

3. All the stuff from the old std.path should be "scheduled for
deprecation". Deprecating stuff breaks people's build processes and
should only be done after adequate warning. I don't know a good way to
implement this for enums and aliases, though, which I guess begs the
question of whether DMD should support soft deprecation.

4. This module might be useful at compile time. As pointed out on this
forum, Phobos should eventually include tests for CTFE-ability. It
would be nice if std.path had a few CTFE unit tests sprinkled in.

On 7/14/2011 8:20 PM, dsimcha wrote:
Lars Kyllingstad's new and improved std.path module for Phobos is the
next item up in the review queue. I've volunteered to be the review
manager. Barring any objections, the review starts now and ends at the
ends two weeks from now on July 28. This will be followed by a week of
voting, ending August 4th.

Code:
https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

Docs:
http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

Author's comments -- please read before reviewing:

First of all, note that I have rewritten almost everything from scratch.
Mainly, this is because there are a lot of corner cases which aren't
handled correctly by the current std.path, and in the end rewriting
turned out to be easier than updating. (Check out the unittests; they
cover all the cases I could think of.) The only functions I have kept
unchanged from the current std.path are pathCharMatch (formerly
fncharmatch), glob (formerly fnmatch) and expandTilde.

Secondly, I hope you will find the new function names to be both more
descriptive, more consistent and more in line with current Phobos
conventions. They are the result of a (rather informal) round of
discussion and voting in this forum a few months ago (which I hope you
will consider before suggesting new names):

http://www.digitalmars.com/d/archives/digitalmars/D/Proposal_for_std.path_replacement_131121.html

Thirdly, I have applied safe, pure and nothrow wherever possible.
There were cases where this was not possible, either due to bugs in DMD
(specifically, 5304, 5700 and 5798) or due to functions in other Phobos
modules not being properly marked with these attributes. I have marked
these with //TODO comments and will fix them as soon as possible.

And finally, due to the deprecated: block at the bottom of the module,
this version should be backwards compatible with the current std.path
for a while yet.

Jul 14 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On 2011-07-14 17:33, dsimcha wrote:
 3. All the stuff from the old std.path should be "scheduled for
 deprecation". Deprecating stuff breaks people's build processes and
 should only be done after adequate warning. I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

A documentation note for it can and should be added to each function saying 
that it's scheduled for deprecation (std.ctype, std.string, and std.file all 
have some good examples of this). Unfortunately, the only way at this point to 
print a message about a function being scheduled for deprecation is to use a 
pragma inside of the function - which only works with templated functions. 
However, there's been a number of complaints on the Announce list about the 
messages which are currently printed for functions which are scheduled for 
deprecation, so I don't know how we want to handle such messages. At minimum, 
the old functions should be in the documentation and labeled as scheduled for 
deprecation rather than deprecated.

- Jonathan M Davis

Jul 14 2011

"Nick Sabalausky" <a a.a> writes:

"dsimcha" <dsimcha yahoo.com> wrote in message 
news:ivo271$1fu0$1 digitalmars.com...
 3.  All the stuff from the old std.path should be "scheduled for 
 deprecation".  Deprecating stuff breaks people's build processes and 
 should only be done after adequate warning.  I don't know a good way to 
 implement this for enums and aliases, though, which I guess begs the 
 question of whether DMD should support soft deprecation.

Isn't that what -d is for?

 1.  Do we really need currentDirSymbol and parentDirSymbol?  AFAIK they're 
 standard across all operating systems that anyone cares about and everyone 
 can easily remember them.  I have never used the equivalents in the old 
 std.path and have never missed them.

 4.  This module might be useful at compile time.  As pointed out on this 
 forum, Phobos should eventually include tests for CTFE-ability.  It would 
 be nice if std.path had a few CTFE unit tests sprinkled in.

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 15 July 2011 04:13:35 Nick Sabalausky wrote:
 "dsimcha" <dsimcha yahoo.com> wrote in message
 news:ivo271$1fu0$1 digitalmars.com...
 
 3.  All the stuff from the old std.path should be "scheduled for
 deprecation".  Deprecating stuff breaks people's build processes and
 should only be done after adequate warning.  I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

 
 Isn't that what -d is for?

The way that deprecation is supposed to work is in three stages:

1. Scheduled for deprecation.
2. Deprecated.
3. Removed.

In stage 1, users are notified that a symbol is going to be deprecated, but it 
is not yet deprecated. This gives them time to alter their code accordingly 
before they will have to build with -d. In stage 2, the symbol is actually 
deprecated with the deprecated keyword, and -d is required to compile code 
which uses that symbol. If the user has not yet altered their code, then they 
will have to alter their build scripts to use -d. Finally, in stage 3, the 
deprecated symbol is removed.

In stage 1, when a symbol is scheduled for deprecation, the documentation is 
altered to reflect that, and ideally a message would be printed out if it's 
used so that programmers will be better notified about the impending 
deprecation. However, since the only tool we have for that at the moment is 
pragmas, only entire modules and templated functions or types or types can 
have such messages (since otherwise the message would print regardless of 
whether you use the symbol). A number of complaints have arisen about these 
messages, however  (a number were introduced in 2.054 - both for stuff which 
had already scheduled for deprecation and for stuff which was just scheduled 
for deprecation). So, we may end up ditching the pragma messages. I don't 
know.

In either case, Walter has been pretty adamant about not breaking users' code 
without notice. So, immediately deprecating something (thereby breaking their 
code - even if all it takes is changing their build scripts to use -d) is not 
generally acceptable - hence the "scheduled for deprecation" phase.

The current plan is that stage 1 and stage 2 both take 6 months each, but we 
may adjust that depending on what's being changed (e.g.g smaller changes or 
changes of recently added stuff may take less time).

- Jonathan M Davis

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 14 Jul 2011 20:33:20 -0400, dsimcha wrote:

 ...And now I'll do the honors and be the first to review.  Overall this
 looks solid and well documented but I have a few nitpicks and questions
 
 1.  Do we really need currentDirSymbol and parentDirSymbol?  AFAIK
 they're standard across all operating systems that anyone cares about
 and everyone can easily remember them.  I have never used the
 equivalents in the old std.path and have never missed them.

I agree.  (I didn't really want to keep them, but I wasn't sure whether 
anyone used them.)  If I don't hear any objections, I will remove 
currentDirSymbol and parentDirSymbol.

 
 2.  There's really no need to always allocate a new array on things like
 setExtension.  You could just append the extension in the case where the
 extension doesn't already exist and the string is immutable.  I'm not
 sure this is worth the extra complexity, though, given that setExtension
 will probably never be used in perf-critical code.

I guess it depends on what is the most common case.  If "immutable 
without extension" is the most common case then it could be worth it.  
Otherwise it just adds code bloat for little gain.

 
 3.  All the stuff from the old std.path should be "scheduled for
 deprecation".  Deprecating stuff breaks people's build processes and
 should only be done after adequate warning.  I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

True.  As Jonathan points out, there is ongoing discussion about how best 
to do deprecation.  As has also been pointed out elsewhere, the old 
functions should be kept as they are, so behaviour doesn't silently 
change.  I'll add them back in, and include "scheduled for deprecation" 
pragma(msg)s where possible.


 4.  This module might be useful at compile time.  As pointed out on this
 forum, Phobos should eventually include tests for CTFE-ability.  It
 would be nice if std.path had a few CTFE unit tests sprinkled in.

I'm not sure I agree with this.  As I understand it, most of D should 
eventually be usable in CTFE.  (Don has made enormous progress in this 
regard.)  In most cases we therefore shouldn't have to change Phobos to 
make it CTFEable, it should Just Work.  On that note, it shouldn't be 
necessary to add CTFE-specific unittests either.

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpk4e$gf5$1 digitalmars.com...
 4.  This module might be useful at compile time.  As pointed out on this
 forum, Phobos should eventually include tests for CTFE-ability.  It
 would be nice if std.path had a few CTFE unit tests sprinkled in.

 I'm not sure I agree with this.  As I understand it, most of D should
 eventually be usable in CTFE.  (Don has made enormous progress in this
 regard.)  In most cases we therefore shouldn't have to change Phobos to
 make it CTFEable, it should Just Work.  On that note, it shouldn't be
 necessary to add CTFE-specific unittests either.

I totally understand that reasoning, but as I pointed out in the OP of the 
big CTFE-ability thread: We're currently in a situation where basic 
commonly-needed-in-CTFE functions are frequently breaking at compile-time, 
which forces D users to avoid using Phobos at compile time and reimplement 
any needed Phobos functions themselves. It would be better to just 1. Have 
unittests to detect breakages of at least the functions that are likely to 
be needed at compile-time, and 2. Insert quick-n-dirty CTFE-able alternate 
codepaths (via "if(_ctfe)") around any part of the affected functions that's 
broken at compile time.

Plus, the unittests can then stay in to help detect CTFE regressions. In 
fact, one of the CTFE bugs I reported one time was a result of a 
CTFE-ability unittest I had for one of my functions. (Although, it was a 
function that was specifically written to be used at compile time...So it 
was really just a unittest to make sure it worked as intended...)

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 15:57:58 -0400, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpk4e$gf5$1 digitalmars.com...
 4.  This module might be useful at compile time.  As pointed out on
 this forum, Phobos should eventually include tests for CTFE-ability. 
 It would be nice if std.path had a few CTFE unit tests sprinkled in.

 I'm not sure I agree with this.  As I understand it, most of D should
 eventually be usable in CTFE.  (Don has made enormous progress in this
 regard.)  In most cases we therefore shouldn't have to change Phobos to
 make it CTFEable, it should Just Work.  On that note, it shouldn't be
 necessary to add CTFE-specific unittests either.

 I totally understand that reasoning, but as I pointed out in the OP of
 the big CTFE-ability thread: We're currently in a situation where basic
 commonly-needed-in-CTFE functions are frequently breaking at
 compile-time, which forces D users to avoid using Phobos at compile time
 and reimplement any needed Phobos functions themselves. It would be
 better to just 1. Have unittests to detect breakages of at least the
 functions that are likely to be needed at compile-time, and 2. Insert
 quick-n-dirty CTFE-able alternate codepaths (via "if(_ctfe)") around any
 part of the affected functions that's broken at compile time.
 
 Plus, the unittests can then stay in to help detect CTFE regressions. In
 fact, one of the CTFE bugs I reported one time was a result of a
 CTFE-ability unittest I had for one of my functions. (Although, it was a
 function that was specifically written to be used at compile time...So
 it was really just a unittest to make sure it worked as intended...)

I didn't notice your CTFE-ability thread until now, because I was away on 
vacation when you posted it.  I'll make sure to read it, maybe it will 
convince me. ;)

-Lars

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Lars, is normalize basically that feature request (toNativePath) I
asked for? It does seem to do more than that though.

Jul 14 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 02:35:02 +0200, Andrej Mitrovic wrote:

 Lars, is normalize basically that feature request (toNativePath) I asked
 for? It does seem to do more than that though.

Oops, sorry.  Because your feature request came in on a different channel 
than the others (my GitHub inbox), I completely forgot about it.

normalize() is different -- it changes '/' to '\' on Windows, but it 
doesn't do anything with '\' on POSIX, since the backslash is a perfectly 
valid filename character on that platform.

Basically, Andrej has requested I add a toNativePath function which 
replaces all occurrences of '/' in a string with '\' on Windows, and vice 
versa on POSIX.  Thinking some more about it, I personally don't think 
I'll find this very useful, but I am interested in hearing others' 
opinions.

Does anyone else think toNativePath should be included?

-Lars

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

The thing although Windows works with forward slashes some tools might
not work with forward slashes, so toNativePath could be useful there.

Then again for example GIT doesn't work with backward slashes (at
least .gitignore doesn't want to work with it). So maybe it should
have an optional parameter that chooses between Posix or Windows
paths..

Or we can just use string.replace("/", r"\"); But having it in
std.path would be more useful.

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

The thing is*

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

All functions have nice names except fcmp, why the abbreviation?

Is expandTilde supposed to work on Windows? I've never seen tilde used there.

Jul 14 2011

"Nick Sabalausky" <a a.a> writes:

"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.1654.1310690711.14074.digitalmars-d puremagic.com...
 All functions have nice names except fcmp, why the abbreviation?

 Is expandTilde supposed to work on Windows? I've never seen tilde used 
 there.

The docs say that it does nothing on Windows: "For Windows, expandTilde() 
merely returns its argument inputPath."

This is probably for the best because the Windows equivilent to "/home/user" 
can be one of a number of different locations depending on what exactly 
you're storing. If expandTilde did work on Windows, then no matter what it 
expanded to, it would be wrong at least half the time.

Jul 15 2011

Mafi <mafi example.org> writes:

Am 15.07.2011 10:22, schrieb Nick Sabalausky:
 "Andrej Mitrovic"<andrej.mitrovich gmail.com>  wrote in message
 news:mailman.1654.1310690711.14074.digitalmars-d puremagic.com...
 All functions have nice names except fcmp, why the abbreviation?

 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

 The docs say that it does nothing on Windows: "For Windows, expandTilde()
 merely returns its argument inputPath."

 This is probably for the best because the Windows equivilent to "/home/user"
 can be one of a number of different locations depending on what exactly
 you're storing. If expandTilde did work on Windows, then no matter what it
 expanded to, it would be wrong at least half the time.

What about longname becoming longna~1 on dos/emulators etc. Expanding 
this back needs information about the actual state of the file system so 
I don't now if such a function belongs to std.path but a longname.png => 
longna~1.png belongs IMO to std.path .

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 14:12:55 +0200, Mafi wrote:

 What about longname becoming longna~1 on dos/emulators etc. Expanding
 this back needs information about the actual state of the file system so
 I don't now if such a function belongs to std.path but a longname.png =>
 longna~1.png belongs IMO to std.path .

That too, requires information about the state of the filesystem.  For 
instance, if the file "longnaps" already exists, and is abbreviated to 
"longna~1", the correct abbreviation of "longname" would be "longna~2".

-Lars

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

For consistency with std.algorithm.cmp and std.string.icmp, on which it 
is based.  I'd be open for other suggestions if people think it should be 
changed.


 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

Nope, as specified in the docs it is a no-op on Windows.  As Nick points 
out, it is not obvious what is the correct "home directory" on Windows.  
Moreover, and perhaps even more importantly, there is no precedence for 
the tilde character to mean "home directory" on Windows at all.

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpl34$gf5$3 digitalmars.com...
 On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

 For consistency with std.algorithm.cmp and std.string.icmp, on which it
 is based.  I'd be open for other suggestions if people think it should be
 changed.

I'm happy with it having the obscure name "fcmp" because *all* it is is 
either a case-insensitive or case-sensitive string comparison, and I think 
it's dangerous to blindly use that to compare filepaths. If it were named 
something like "filePathCompare", then people would get a nastly surprise 
when this returned false instead of true:

filePathCompare(r"C:\dir\file.txt", r"C:\dir/foo\..\file.txt");

I can't imagine getting false from that would ever be desirable.

This is a big part of why I think we need a canonical() function (although 
maybe a more like "canonicalPath" for the same reason that we have 
"joinPath" instead of "join").

 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

 Nope, as specified in the docs it is a no-op on Windows.  As Nick points
 out, it is not obvious what is the correct "home directory" on Windows.
 Moreover, and perhaps even more importantly, there is no precedence for
 the tilde character to mean "home directory" on Windows at all.

Actually, if Windows did have one single equivalent to "/home/{user}", then 
I think it would be absolutely fantastic to make tilde mean "home directory" 
on Windows. But as things are, that's a moot point, of course.

Jul 15 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpl34$gf5$3 digitalmars.com...
 On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

 For consistency with std.algorithm.cmp and std.string.icmp, on which it
 is based.  I'd be open for other suggestions if people think it should  
 be
 changed.

 I'm happy with it having the obscure name "fcmp" because *all* it is is
 either a case-insensitive or case-sensitive string comparison, and I  
 think
 it's dangerous to blindly use that to compare filepaths. If it were named
 something like "filePathCompare", then people would get a nastly surprise
 when this returned false instead of true:

 filePathCompare(r"C:\dir\file.txt", r"C:\dir/foo\..\file.txt");

 I can't imagine getting false from that would ever be desirable.

 This is a big part of why I think we need a canonical() function  
 (although
 maybe a more like "canonicalPath" for the same reason that we have
 "joinPath" instead of "join").

 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

 Nope, as specified in the docs it is a no-op on Windows.  As Nick points
 out, it is not obvious what is the correct "home directory" on Windows.
 Moreover, and perhaps even more importantly, there is no precedence for
 the tilde character to mean "home directory" on Windows at all.

 Actually, if Windows did have one single equivalent to "/home/{user}",  
 then
 I think it would be absolutely fantastic to make tilde mean "home  
 directory"
 on Windows. But as things are, that's a moot point, of course.

The typical thing is to use the environment variable USERPROFILE, but  
that's an odd dependency for a string processing library.  Windows also  
doesn't keep the same notion of home directory as Unix does.  You can read  
more about it here:  http://en.wikipedia.org/wiki/Environment_variable

If that's the only function that uses environment variables, and you can't  
have the function without it, it might be worth having it.  However, we  
must think about how that affects things like purity.

-Steve

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.vyoeirtveav7ka localhost.localdomain...
 On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 Actually, if Windows did have one single equivalent to "/home/{user}", 
 then
 I think it would be absolutely fantastic to make tilde mean "home 
 directory"
 on Windows. But as things are, that's a moot point, of course.

 The typical thing is to use the environment variable USERPROFILE, but 
 that's an odd dependency for a string processing library.  Windows also 
 doesn't keep the same notion of home directory as Unix does.  You can read 
 more about it here:  http://en.wikipedia.org/wiki/Environment_variable

 If that's the only function that uses environment variables, and you can't 
 have the function without it, it might be worth having it.  However, we 
 must think about how that affects things like purity.

No, like I've said, all the possible choices are frequently wrong, and that 
includes USERPROFILE.

If you're selecting a default directory for the user to save/load a word 
processing document, then USERPROFILE is wrong (should be the user's My 
Documents - or the last used directory, whatever).

If you're storing per-user application-settings/internal-application-data, 
then USERPROFILE is wrong. It should be APPDATA if it's roaming data (should 
remain with the user on different computers). Or if it's machine-specific 
(non-roaming) data, then it belongs in whatever the env variable for 
"%APPDATA%\Local Settings\Application Data" is.

So expanding tilde to USERPROFILE will just encourage people to do the wrong 
thing.

Jul 15 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 15 Jul 2011 16:15:23 -0400, Nick Sabalausky <a a.a> wrote:

 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message
 news:op.vyoeirtveav7ka localhost.localdomain...
 On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 Actually, if Windows did have one single equivalent to "/home/{user}",
 then
 I think it would be absolutely fantastic to make tilde mean "home
 directory"
 on Windows. But as things are, that's a moot point, of course.

 The typical thing is to use the environment variable USERPROFILE, but
 that's an odd dependency for a string processing library.  Windows also
 doesn't keep the same notion of home directory as Unix does.  You can  
 read
 more about it here:  http://en.wikipedia.org/wiki/Environment_variable

 If that's the only function that uses environment variables, and you  
 can't
 have the function without it, it might be worth having it.  However, we
 must think about how that affects things like purity.

 No, like I've said, all the possible choices are frequently wrong, and  
 that
 includes USERPROFILE.

 If you're selecting a default directory for the user to save/load a word
 processing document, then USERPROFILE is wrong (should be the user's My
 Documents - or the last used directory, whatever).

 If you're storing per-user  
 application-settings/internal-application-data,
 then USERPROFILE is wrong. It should be APPDATA if it's roaming data  
 (should
 remain with the user on different computers). Or if it's machine-specific
 (non-roaming) data, then it belongs in whatever the env variable for
 "%APPDATA%\Local Settings\Application Data" is.

 So expanding tilde to USERPROFILE will just encourage people to do the  
 wrong
 thing.

Certainly not having that function on Windows will encourage them to  
choose a wrong choice then, no?  I mean if they want to find a "home  
directory" and phobos doesn't support that, then they google online, find  
something like %userprofile%, use it, and maybe file a bug against phobos  
for good measure :)

So there is no right answer, we should at least come up with *an* answer.   
In the Windows world of de-facto standards, we're bound to start a trend ;)

What about this?

~/Application Data => %APPDATA%
~/Local Settings/Application Data => %LOCALAPPDATA% or %USERPROFILE%\Local  
Settings\Application Data
etc.
Last resort:
~ => %USERPROFILE%

In other words, expand more than just the tilde.

My thoughts are, if we try and take a stab at making something  
intelligently decide where to store files, people will appreciate it more  
than having to search the web for a right answer to something that doesn't  
have one.

-Steve

Jul 18 2011

David Nadlinger <see klickverbot.at> writes:

On 7/18/11 3:55 PM, Steven Schveighoffer wrote:
 In other words, expand more than just the tilde.

 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it
 more than having to search the web for a right answer to something that
 doesn't have one.

Just a quick idea: Maybe we could provide a set of standard directories 
(e.g. UserData.config, UserData.cache, …) along with a function which 
maps them to the the operating system defaults, i.e. the well-known 
directories on Windows, the $XDG_*_HOME ones on Linux, etc.

David

Jul 18 2011

torhu <no spam.invalid> writes:

On 18.07.2011 15:55, Steven Schveighoffer wrote:
...
 Certainly not having that function on Windows will encourage them to
 choose a wrong choice then, no?  I mean if they want to find a "home
 directory" and phobos doesn't support that, then they google online, find
 something like %userprofile%, use it, and maybe file a bug against phobos
 for good measure :)

 So there is no right answer, we should at least come up with *an* answer.
 In the Windows world of de-facto standards, we're bound to start a trend ;)

 What about this?

 ~/Application Data =>  %APPDATA%
 ~/Local Settings/Application Data =>  %LOCALAPPDATA% or %USERPROFILE%\Local
 Settings\Application Data
 etc.
 Last resort:
 ~ =>  %USERPROFILE%

 In other words, expand more than just the tilde.

Wouldn't that tend to cause the same issue? They use tilde on Linux, 
then get %userprofile% because they didn't change it when they port to 
Windows.  If people know that they want %appdata%, they could just use that.

 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it more
 than having to search the web for a right answer to something that doesn't
 have one.

That could be done, here's an example I'm familiar with:
http://www.allegro.cc/manual/5/al_get_standard_path

This is designed to work across all the major platforms.  Some paths 
will be the same on some platforms.  ALLEGRO_USER_DOCUMENTS_PATH, 
ALLEGRO_USER_DATA_PATH, and ALLEGRO_USER_SETTINGS_PATH are probably the 
most relevant ones for this discussion.

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Jul 2011 14:29:08 -0400, torhu <no spam.invalid> wrote:

 On 18.07.2011 15:55, Steven Schveighoffer wrote:
 ...
 Certainly not having that function on Windows will encourage them to
 choose a wrong choice then, no?  I mean if they want to find a "home
 directory" and phobos doesn't support that, then they google online,  
 find
 something like %userprofile%, use it, and maybe file a bug against  
 phobos
 for good measure :)

 So there is no right answer, we should at least come up with *an*  
 answer.
 In the Windows world of de-facto standards, we're bound to start a  
 trend ;)

 What about this?

 ~/Application Data =>  %APPDATA%
 ~/Local Settings/Application Data =>  %LOCALAPPDATA% or  
 %USERPROFILE%\Local
 Settings\Application Data
 etc.
 Last resort:
 ~ =>  %USERPROFILE%

 In other words, expand more than just the tilde.

 Wouldn't that tend to cause the same issue? They use tilde on Linux,  
 then get %userprofile% because they didn't change it when they port to  
 Windows.  If people know that they want %appdata%, they could just use  
 that.

So without expansion they get some local directory named ~ created (~ is a  
valid directory name in Windows).  That sounds worse...

I'd rather have it do something sane when I'm not smart enough to  
special-code Windows stuff, but when I do want to handle it, I'd like to  
not have to deal with the environment (i.e. differences between XP and  
Win7).

 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it  
 more
 than having to search the web for a right answer to something that  
 doesn't
 have one.

 That could be done, here's an example I'm familiar with:
 http://www.allegro.cc/manual/5/al_get_standard_path

 This is designed to work across all the major platforms.  Some paths  
 will be the same on some platforms.  ALLEGRO_USER_DOCUMENTS_PATH,  
 ALLEGRO_USER_DATA_PATH, and ALLEGRO_USER_SETTINGS_PATH are probably the  
 most relevant ones for this discussion.

I don't know if it's worth mapping the Windows directory scheme on to  
Linux any more than it's worth mapping the Linux conventions onto the  
Windows ones.  The one thing that's common is, both have a OS-specified  
location to store a specific user's file (Home directory).  I realize that  
using %USERPROFILE% is not *correct*.  But it's better than doing  
*nothing*.

So my view is:

~ => by itself it maps to your home directory, be that /home/you or  
%USERPROFILE%.
~/xyz => on Linux maps to a subdirectory xyz under home, on Windows, may  
be a more intelligent expansion.

-Steve

Jul 18 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 7/15/11, Nick Sabalausky <a a.a> wrote:
 Actually, if Windows did have one single equivalent to "/home/{user}", then
 I think it would be absolutely fantastic to make tilde mean "home directory"
 on Windows. But as things are, that's a moot point, of course.

%UserProfile%

http://en.wikipedia.org/wiki/Home_directory#Default_Home_Directory_per_Operating_System

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Also, you can change the location of your home directory, although
it's not recommended since some applications like to hardwire
directories.UserProfile should reflect the changes though.

Jul 15 2011

bearophile <bearophileHUGS lycos.com> writes:

Just few comments to the source code. I have not used it yet.

version (Windows) private alias Signed!size_t ssize_t;

Why not just ptrdiff_t ?

-------------------

In theory this too is (in C[] path) but I presume you have found troubles:
private C[] chompDirSeparators(C)(C[] path)   safe pure nothrow

-------------------

So here you have had to use Unqual
immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[] ext)

Instead of Unqual isn't it nicer to use a Deconst!() template?

-------------------

// Detects whether the given types are all string types of the same width
private template Strings...)  if (Strings.length > 0)

I think that a CTFE function with a static foreach is better/faster/simpler
than the recursive compatibleStrings() template.

-------------------

unittest
{

I suggest to add a comment that tells what function/class/struct is tested in
this unittest (this is a poor's man named unittest).

Bye,
bearophile

Jul 14 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 14 Jul 2011 21:00:26 -0400, bearophile wrote:

 Just few comments to the source code. I have not used it yet.
 
 version (Windows) private alias Signed!size_t ssize_t;
 
 Why not just ptrdiff_t ?

I find ssize_t (which is standard on POSIX) to be much more descriptive 
of its purpose.

http://en.wikipedia.org/wiki/Size_t


 In theory this too is (in C[] path) but I presume you have found
 troubles: private C[] chompDirSeparators(C)(C[] path)   safe pure
 nothrow

The problem is that path will then be const(C[]), and it becomes 
impossible to return a slice of it as C[].  The correct solution is to 
use inout, and I intend to do so once inout is correctly implemented in 
DMD.


 So here you have had to use Unqual
 immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
 immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[]
 ext)
 
 Instead of Unqual isn't it nicer to use a Deconst!() template?

Hmm, I guess you're right.  "shared" shouldn't be stripped, for instance.


 // Detects whether the given types are all string types of the same
 width private template Strings...)  if (Strings.length > 0)
 
 I think that a CTFE function with a static foreach is
 better/faster/simpler than the recursive compatibleStrings() template.

Maybe, but I also find it uglier in use due to the extra parentheses.

   if (compatibleStrings!(T, U, V)())

Personally, I don't think it's worth changing it.


 unittest
 {
 
 I suggest to add a comment that tells what function/class/struct is
 tested in this unittest (this is a poor's man named unittest).

I don't think that is necessary.  Looking at the unittests in this 
module, I think it is rather obvious what is being tested -- not to 
mention that the unittests always follow directly after the function 
being tested.

-Lars

Jul 15 2011

KennyTM~ <kennytm gmail.com> writes:

On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
  So here you have had to use Unqual
  immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
  immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[]
  ext)

  Instead of Unqual isn't it nicer to use a Deconst!() template?


 Hmm, I guess you're right.  "shared" shouldn't be stripped, for instance.

Given that immutable( const(char) ) == immutable(char), I think the 
Unqual! should simply be removed.

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
  So here you have had to use Unqual
  immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in
  C2[] ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in
  C1[] path, in C2[] ext)
  
  Instead of Unqual isn't it nicer to use a Deconst!() template?


 
 Hmm, I guess you're right.  "shared" shouldn't be stripped, for
 instance.

 
 Given that immutable( const(char) ) == immutable(char), I think the
 Unqual! should simply be removed.

I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from 
what I've seen, it can get kind of funny when you try and have an immutable 
const or a const immutable. Using  Unqual makes it very clear what you mean.

- Jonathan M Davis

Jul 15 2011

KennyTM~ <kennytm gmail.com> writes:

On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
   So here you have had to use Unqual
   immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in
   C2[] ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in
   C1[] path, in C2[] ext)

   Instead of Unqual isn't it nicer to use a Deconst!() template?


 Hmm, I guess you're right.  "shared" shouldn't be stripped, for
 instance.

 Given that immutable( const(char) ) == immutable(char), I think the
 Unqual! should simply be removed.

 I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from
 what I've seen, it can get kind of funny when you try and have an immutable
 const or a const immutable. Using  Unqual makes it very clear what you mean.

 - Jonathan M Davis

OK. But I think you should file the compiler bug :) From what I see, 
it's that 'immutable' always win.

--------------------------
alias const(char) CC;
alias immutable(char) IC;
alias immutable(CC) ICC;
alias const(IC) CIC;
pragma(msg, ICC);
pragma(msg, CIC);
--------------------------
immutable(char)
immutable(char)
--------------------------

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sat, 16 Jul 2011 00:17:03 +0800, KennyTM~ wrote:

 On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
   So here you have had to use Unqual
   immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[]
   ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[]
   path, in C2[] ext)

   Instead of Unqual isn't it nicer to use a Deconst!() template?


 Hmm, I guess you're right.  "shared" shouldn't be stripped, for
 instance.

 Given that immutable( const(char) ) == immutable(char), I think the
 Unqual! should simply be removed.

 I'd still put the Unqual in there. Perhaps it's due to compiler bugs,
 but from what I've seen, it can get kind of funny when you try and have
 an immutable const or a const immutable. Using  Unqual makes it very
 clear what you mean.

 - Jonathan M Davis

 
 OK. But I think you should file the compiler bug :) From what I see,
 it's that 'immutable' always win.
 
 --------------------------
 alias const(char) CC;
 alias immutable(char) IC;
 alias immutable(CC) ICC;
 alias const(IC) CIC;
 pragma(msg, ICC);
 pragma(msg, CIC);
 --------------------------
 immutable(char)
 immutable(char)
 --------------------------

True, but this doesn't apply to the present case.  Since the parameters 
are marked with 'in', they become const(immutable(char)[]), not const
(immutable(char))[].  This isn't too hard to fix, but I prefer to use 
Unqual, Deconst, Mutable, or whatever it ends up being called.

-Lars

Jul 15 2011

KennyTM~ <kennytm gmail.com> writes:

On Jul 16, 11 04:16, Lars T. Kyllingstad wrote:
 On Sat, 16 Jul 2011 00:17:03 +0800, KennyTM~ wrote:

 On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
    So here you have had to use Unqual
    immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[]
    ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[]
    path, in C2[] ext)

    Instead of Unqual isn't it nicer to use a Deconst!() template?


 Hmm, I guess you're right.  "shared" shouldn't be stripped, for
 instance.

 Given that immutable( const(char) ) == immutable(char), I think the
 Unqual! should simply be removed.

 I'd still put the Unqual in there. Perhaps it's due to compiler bugs,
 but from what I've seen, it can get kind of funny when you try and have
 an immutable const or a const immutable. Using  Unqual makes it very
 clear what you mean.

 - Jonathan M Davis

 OK. But I think you should file the compiler bug :) From what I see,
 it's that 'immutable' always win.

 --------------------------
 alias const(char) CC;
 alias immutable(char) IC;
 alias immutable(CC) ICC;
 alias const(IC) CIC;
 pragma(msg, ICC);
 pragma(msg, CIC);
 --------------------------
 immutable(char)
 immutable(char)
 --------------------------

 True, but this doesn't apply to the present case.  Since the parameters
 are marked with 'in', they become const(immutable(char)[]), not const
 (immutable(char))[].  This isn't too hard to fix, but I prefer to use
 Unqual, Deconst, Mutable, or whatever it ends up being called.

 -Lars

Even if `typeof(path)` becomes `const(string)`, C1 is still an 
`immutable(char)`, so `immutable(C1)[]` will still work.

------------------------------------
immutable(C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext) {
     pragma(msg, typeof(return), " <- ", C1);
     return typeof(return).init;
}
void main() {
     setExtension("1", "2");
     setExtension("3".dup, "4".dup);
     setExtension(cast(const)"5".dup, cast(const)"6".dup);
}
------------------------------------
string <- immutable(char)
string <- char
string <- const(char)
------------------------------------

Jul 15 2011

bearophile <bearophileHUGS lycos.com> writes:

dsimcha:

 Author's comments -- please read before reviewing:

 Thirdly, I have applied  safe, pure and nothrow wherever possible.
 There were cases where this was not possible, either due to bugs in DMD
 (specifically, 5304, 5700 and 5798) or due to functions in other Phobos
 modules not being properly marked with these attributes.  I have marked
 these with //TODO comments and will fix them as soon as possible.

Adding pure and nothrow to functions in other Phobos modules is a good thing.
There are functions like to!int("12") that to become pure require some other
functions to become pure first. I think that currently functions that contain
enforce can't be pure.

Bye,
bearophile

Jul 14 2011

KennyTM~ <kennytm gmail.com> writes:

On Jul 15, 11 10:22, bearophile wrote:
 dsimcha:

 Author's comments -- please read before reviewing:

 Thirdly, I have applied  safe, pure and nothrow wherever possible.
 There were cases where this was not possible, either due to bugs in DMD
 (specifically, 5304, 5700 and 5798) or due to functions in other Phobos
 modules not being properly marked with these attributes.  I have marked
 these with //TODO comments and will fix them as soon as possible.

 Adding pure and nothrow to functions in other Phobos modules is a good thing.
 There are functions like to!int("12") that to become pure require some other
functions to become pure first. I think that currently functions that contain
enforce can't be pure.

 Bye,
 bearophile

The 'enforce' problem is simple. It could be easier replaced by the 
'if(...) throw' construct when not composing, and Don had made a patch 
for bug 5750 to make it pure also.

The big problem is GC.malloc (a.k.a. std.array.uninitializedArray) and 
other GC functions (a.k.a. std.array.appender). They cannot be easily 
replaced without sacrificing performance. (I believe all these allocator 
functions should be made weakly-pure (bug 6151), but so far there isn't 
official decision).

to!int("12") is not pure because of:
  - std.array.front and popFront, because of:


        - std.conv.to!(string, size_t), because of:
           - GC.malloc, or std.array.uninitializedArray if Phobos pull 

        - std.conv.to!(string, const(ubyte)[]), because of:
           - std.conv.to!(string, ubyte), because of 
std.conv.to!(string, uint)
           - std.array.appender, because of:
              - GC.extend
              - GC.qalloc
              - memcpy (just need a 'pure' annotation)
        * All of the above are only because of the error-reporting 
facility. The main content of std.utf.decode is pure and  safe. In 
particular, I don't think it's proper to throw the whole string's 
encoding back to the user.
  - ConvOverflowException.raise, just lacking a 'pure' annotation
  - convError, because of std.conv.to!(string, uint).

pure even if the function body is all pure.

Jul 15 2011

KennyTM~ <kennytm gmail.com> writes:

IANAL, but there should be mention of the Boost license somewhere.

ssize_t (Line 50):     Why not use sizediff_t?
Line 58..67: I think version blocks should be like

     version (Windows)
     {
       ...
     }
     else version (Posix)
     {
       ...
     }
     else
        static assert(0, "unsupported platform");


pathSplitter: Is it possible to make it a bidirectional range (implement 
.back and .popBack())?

pathCharMatch: On OS X the file system is case-insensitive by default. 
Also, I'm not sure if restricting to ASCII is fine. Actually the 
case-sensitivity is independent of the platform, e.g. you can configure 
a case-sensitive disk on Windows.

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"KennyTM~" <kennytm gmail.com> wrote in message 
news:ivosru$2v11$1 digitalmars.com...
 pathCharMatch: On OS X the file system is case-insensitive by default. 
 Also, I'm not sure if restricting to ASCII is fine. Actually the 
 case-sensitivity is independent of the platform, e.g. you can configure a 
 case-sensitive disk on Windows.

Fun!

/me shoots self

Jul 15 2011

Jacob Carlborg <doob me.com> writes:

On 2011-07-15 10:08, KennyTM~ wrote:
 IANAL, but there should be mention of the Boost license somewhere.

 ssize_t (Line 50): Why not use sizediff_t?
 Line 58..67: I think version blocks should be like

 version (Windows)
 {
 ...
 }
 else version (Posix)
 {
 ...
 }
 else
 static assert(0, "unsupported platform");


 pathSplitter: Is it possible to make it a bidirectional range (implement
 .back and .popBack())?

 pathCharMatch: On OS X the file system is case-insensitive by default.
 Also, I'm not sure if restricting to ASCII is fine. Actually the
 case-sensitivity is independent of the platform, e.g. you can configure
 a case-sensitive disk on Windows.

It's possible to do that on Mac OS X as well.

-- 
/Jacob Carlborg

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 16:08:37 +0800, KennyTM~ wrote:

 IANAL, but there should be mention of the Boost license somewhere.

Oops, seems I forgot.


 ssize_t (Line 50):     Why not use sizediff_t?

Mainly because I wasn't aware of its existence. ;)  I'll fix.

(I still think ssize_t is a better name, though, and now that you mention 
it, I recall Andrei and Walter discussing this on the Phobos mailing list 
a few months ago.)


 Line 58..67: I think
 version blocks should be like
 
      version (Windows)
      {
        ...
      }
      else version (Posix)
      {
        ...
      }
      else
         static assert(0, "unsupported platform");

Good point.

 
 pathSplitter: Is it possible to make it a bidirectional range (implement
 .back and .popBack())?

It is probably possible, but I don't think the increased complexity will 
be worth it.  Feel free to try to change my mind, though. ;)


 pathCharMatch: On OS X the file system is case-insensitive by default.
 Also, I'm not sure if restricting to ASCII is fine. Actually the
 case-sensitivity is independent of the platform, e.g. you can configure
 a case-sensitive disk on Windows.

Hmm.. how about I add an optional caseSensitive parameter (to both 
pathCharMatch and glob) that defaults to false on Windows and true on 
POSIX?

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"dsimcha" <dsimcha yahoo.com> wrote in message 
news:ivo1ef$1ero$1 digitalmars.com...
 Lars Kyllingstad's new and improved std.path module for Phobos is the next 
 item up in the review queue.

Yay!

First of all, very well-written docs :)

My notes:

- For clarity and completeness, the docs for joinPath should include the 
example:

joinPath("foo", "bar");

It does already have joinPath(r"c:\foo", "bar"), but it's easy to overlook 
the fact that's demonstrating the same thing, especially if you're more 
interested in Posix.

- For generic purposes, maybe joinPath should accept being called with just 
one argument? If it already does, then the "Joins two or more path 
components" should change to "Joins one or more path components"

- Maybe there should be a function canonical() that resolves symbolic links, 
runs absolutePath() and normalize(), and on Windows converts to lowercase. 
Doesn't need to be in there right now though, could wait for a later 
release.

- Docs don't say how pathCharMatch handles slash and backslash.

- I really hate to bring up a function-naming issue at this point, but 
"glob" is commonly known to mean "Return all existing files/paths that match 
this pattern" (And that's something we should have at some point, btw, if we 
don't already). So maybe this glob should be called something like globMatch 
or matchGlob instead.

- I don't think I like how glob works. It doesn't seem to know anything 
about directory separators, which seems unintuitive and problematic. I'd 
really like to see it work like this (basically from Ruby):

assert( glob("dir/foo", "dir/*") );
assert( glob("dir/foo", "dir/**") );

assert( !glob("dir/foo/bar", "dir/*") );
assert( glob("dir/foo/bar", "dir/**") );

assert( glob("dir/foo/abc", "dir/*/abc") );
assert( glob("dir/foo/abc", "dir/**/abc") );

assert( !glob("dir/foo/bar/abc", "dir/*/abc") );
assert( glob("dir/foo/bar/abc", "dir/**/abc") );

assert( glob("dir/foo", "dir/f*o") );
assert( !glob("dir/faa/boo", "dir/f*o") );

As it is right now, how would you do a non-recursive star-match? Doesn't 
look like you even can.

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 "dsimcha" <dsimcha yahoo.com> wrote in message
 news:ivo1ef$1ero$1 digitalmars.com...
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

 Yay!
 
 First of all, very well-written docs :)

Thanks! :)

 
 My notes:
 
 - For clarity and completeness, the docs for joinPath should include the
 example:
 
 joinPath("foo", "bar");
 
 It does already have joinPath(r"c:\foo", "bar"), but it's easy to
 overlook the fact that's demonstrating the same thing, especially if
 you're more interested in Posix.

Good point.


 - For generic purposes, maybe joinPath should accept being called with
 just one argument? If it already does, then the "Joins two or more path
 components" should change to "Joins one or more path components"

I'll look into it.

 
 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait for
 a later release.

I'd be more in favour of adding a resolveSymLinks function to std.file.  
In any case, I'd prefer it not be part of this proposal, and rather be 
added later.


 - Docs don't say how pathCharMatch handles slash and backslash.

True, I'll fix.


 - I really hate to bring up a function-naming issue at this point, but
 "glob" is commonly known to mean "Return all existing files/paths that
 match this pattern" (And that's something we should have at some point,
 btw, if we don't already). So maybe this glob should be called something
 like globMatch or matchGlob instead.
 
 - I don't think I like how glob works. It doesn't seem to know anything
 about directory separators, which seems unintuitive and problematic. I'd
 really like to see it work like this (basically from Ruby):
 
 assert( glob("dir/foo", "dir/*") );
 assert( glob("dir/foo", "dir/**") );
 
 assert( !glob("dir/foo/bar", "dir/*") ); assert( glob("dir/foo/bar",
 "dir/**") );
 
 assert( glob("dir/foo/abc", "dir/*/abc") ); assert( glob("dir/foo/abc",
 "dir/**/abc") );
 
 assert( !glob("dir/foo/bar/abc", "dir/*/abc") ); assert(
 glob("dir/foo/bar/abc", "dir/**/abc") );
 
 assert( glob("dir/foo", "dir/f*o") ); assert( !glob("dir/faa/boo",
 "dir/f*o") );
 
 As it is right now, how would you do a non-recursive star-match? Doesn't
 look like you even can.

I haven't done anything to glob (i.e. fnmatch in the current std.path) 
except change its name and improve its documentation.  For all intents 
and purposes, it is not part of my submission.  If I find the time, 
however, I'll look into improving it.

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpnr5$gf5$7 digitalmars.com...
 On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait for
 a later release.

 I'd be more in favour of adding a resolveSymLinks function to std.file.
 In any case, I'd prefer it not be part of this proposal, and rather be
 added later.

Oh, yea, I didn't mean to imply that there shouldn't be a resolveSymLinks 
function. But I strongly believe that a function which combines all of that 
(and maybe the 8.3 -> full-name expansion on Windows, and maybe expansions 
of things like %APPDATA%) and guarantees a single unique canonical name for 
each actual (existing or non-existing) file would be a very, very, very good 
thing to have. Essenital, in fact, for comparing filepaths.

In any case, I agree that this can all wait until a later time and doesn't 
need to be in this proposal.

 I haven't done anything to glob (i.e. fnmatch in the current std.path)
 except change its name and improve its documentation.  For all intents
 and purposes, it is not part of my submission.  If I find the time,
 however, I'll look into improving it.

I see. In that case, maybe its name should remain fnmatch for now (or fnglob 
if you really want "glob" in there). That would prevent confuson both now 
and whenever it gets improved.

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 15:34:37 -0400, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpnr5$gf5$7 digitalmars.com...
 On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait
 for a later release.

 I'd be more in favour of adding a resolveSymLinks function to std.file.
 In any case, I'd prefer it not be part of this proposal, and rather be
 added later.

 Oh, yea, I didn't mean to imply that there shouldn't be a
 resolveSymLinks function. But I strongly believe that a function which
 combines all of that (and maybe the 8.3 -> full-name expansion on
 Windows, and maybe expansions of things like %APPDATA%) and guarantees a
 single unique canonical name for each actual (existing or non-existing)
 file would be a very, very, very good thing to have. Essenital, in fact,
 for comparing filepaths.
 
 In any case, I agree that this can all wait until a later time and
 doesn't need to be in this proposal.
 
 I haven't done anything to glob (i.e. fnmatch in the current std.path)
 except change its name and improve its documentation.  For all intents
 and purposes, it is not part of my submission.  If I find the time,
 however, I'll look into improving it.

 I see. In that case, maybe its name should remain fnmatch for now (or
 fnglob if you really want "glob" in there). That would prevent confuson
 both now and whenever it gets improved.

Fair enough.  I'll think of a better name and fix it.

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

Since the details of many of these functions have changed, is it really a 
good idea to make the depricated names aliases of the new functions? If 
anyone has code that relies on the screwy old behavior (such as trying to 
compensate for their problems) then that code will silently break.

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 15 July 2011 04:38:35 Nick Sabalausky wrote:
 Since the details of many of these functions have changed, is it really a
 good idea to make the depricated names aliases of the new functions? If
 anyone has code that relies on the screwy old behavior (such as trying to
 compensate for their problems) then that code will silently break.

If the behavior has changed, then the old functions need to be there. 
Otherwise, code will silently break, which would be worse than just outright 
removing the functions.

- Jonathan M Davis

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 01:55:18 -0700, Jonathan M Davis wrote:

 On Friday 15 July 2011 04:38:35 Nick Sabalausky wrote:
 Since the details of many of these functions have changed, is it really
 a good idea to make the depricated names aliases of the new functions?
 If anyone has code that relies on the screwy old behavior (such as
 trying to compensate for their problems) then that code will silently
 break.

 
 If the behavior has changed, then the old functions need to be there.
 Otherwise, code will silently break, which would be worse than just
 outright removing the functions.

I'll put them back in.

-Lars

Jul 15 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the  
 next item up in the review queue.

Three notes:
1. No tests for extension("foo") is null && extension("foo.") !is null
2. absolutePath's signature could be changed to string absolutePath(string  
path, string base = getcwd()) for more flexibility at no penalty
3. Would be nice to have relativePath(path, base) to complement  
absolutePath().

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 7/15/11, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 2. absolutePath's signature could be changed to string absolutePath(string
 path, string base = getcwd()) for more flexibility at no penalty

+1.

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 17:15:25 +0300, Vladimir Panteleev wrote:

 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:
 
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

 
 Three notes:
 1. No tests for extension("foo") is null && extension("foo.") !is null
 2. absolutePath's signature could be changed to string
 absolutePath(string path, string base = getcwd()) for more flexibility
 at no penalty 3. Would be nice to have relativePath(path, base) to
 complement absolutePath().

1. I'll fix.
2. Good idea!
3. There are infinitely many possibilities for a relative path, and I 
don't know how useful this function will be.  Please convince me 
otherwise. :)

-lars

Jul 15 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Fri, 15 Jul 2011 18:55:58 +0300, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 3. There are infinitely many possibilities for a relative path

The shortest version is most useful. The idea is to do this:
1) find a common root
    (if none exists, return input absolute path as is)
2) add as many ../ as necessary to reach common root from base path
3) append second part of input path

 I don't know how useful this function will be.  Please convince me
 otherwise.

Some software and tools will commonly create and output absolute paths.  
There are non-relocatable; for example, if you want to copy a build script  
generated by an IDE, you need to update all absolute paths, or convert  
them to relative ones. My argument is that I recall looking for this  
function in the old std.path several times, and was disappointed not to  
find it.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 19:47:04 +0300, Vladimir Panteleev wrote:

 On Fri, 15 Jul 2011 18:55:58 +0300, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 3. There are infinitely many possibilities for a relative path

 
 The shortest version is most useful. The idea is to do this: 1) find a
 common root
     (if none exists, return input absolute path as is)
 2) add as many ../ as necessary to reach common root from base path 3)
 append second part of input path

Thinking some more about it, this algorithm should be easy to implement 
using pathSplitter.  I'll get onto it. :)

-Lars

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message 
news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the 
 next item up in the review queue.

 Three notes:
 1. No tests for extension("foo") is null && extension("foo.") !is null

I don't know if you're talking about actual unittests or documentation, but 
that brings up a good point: The behavior of extension("foo.") should be 
clearly documented. (Unless it already is and I just overlooked it?)

 2. absolutePath's signature could be changed to string absolutePath(string 
 path, string base = getcwd()) for more flexibility at no penalty
 3. Would be nice to have relativePath(path, base) to complement 
 absolutePath().


totally right about "more flexibility at no penalty". And relativePath() is 
necessary for converting a path into a relocatable form, which definitely 
has its uses especially when dealing with paths that will end up being used 
on other computers.

Jul 15 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 15:42:50 -0400, Nick Sabalausky wrote:

 "Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message
 news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

 Three notes:
 1. No tests for extension("foo") is null && extension("foo.") !is null

 
 I don't know if you're talking about actual unittests or documentation,
 but that brings up a good point: The behavior of extension("foo.")
 should be clearly documented. (Unless it already is and I just
 overlooked it?)

So what you guys are saying is, basically, that it matters whether a 
function returns null or ""?  I'd be inclined to say it doesn't, and 
leave it undefined, but I'd be interested in hearing good arguments 
against this.


 2. absolutePath's signature could be changed to string
 absolutePath(string path, string base = getcwd()) for more flexibility
 at no penalty 3. Would be nice to have relativePath(path, base) to
 complement absolutePath().


 you're totally right about "more flexibility at no penalty". And
 relativePath() is necessary for converting a path into a relocatable
 form, which definitely has its uses especially when dealing with paths
 that will end up being used on other computers.

There is a penalty, albeit a small one, because it needs to be verified 
that isAbsolute(base).

-Lars

Jul 15 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sat, 16 Jul 2011 00:26:15 +0300, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Fri, 15 Jul 2011 15:42:50 -0400, Nick Sabalausky wrote:

 "Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message
 news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

 Three notes:
 1. No tests for extension("foo") is null && extension("foo.") !is null

 I don't know if you're talking about actual unittests or documentation,
 but that brings up a good point: The behavior of extension("foo.")
 should be clearly documented. (Unless it already is and I just
 overlooked it?)

 So what you guys are saying is, basically, that it matters whether a
 function returns null or ""?  I'd be inclined to say it doesn't, and
 leave it undefined, but I'd be interested in hearing good arguments
 against this.

My main argument for this is that this is the *documented* behavior of  
getExt in the current version of std.path. But yes, the difference between  
"" and null is the same as the difference between "empty" and "absent".

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

Names
-----------
dirSeparator:
I'd love to see this as something much shorter like dirSep. I think that 
dirSeparator is overly long - particularly for something which is likely to be 
used primarily in longer expressions.

currentDirSymbol and parentDirSymbol:
The same here. These are _way_ longer than the strings that they represent. If 
we have them, they should be much shorter (e.g. currDirSym and parDirSym). But 
really, I think that I have to concur with dsimcha on this one. The strings 
for these are completely standard, easy to remember, and even shorter names 
for them would still be longer than the actual strings by a fair margin. We 
should just toss them.

isDirSeparator:
I'd love to see this shortened to isDirSep.

extension:
I'd love to see this shortened to ext.

stripExtension:
I'd love to see this shortened to stripExt.

setExtensions:
I think that it should have replace in its name rather than set, since it's 
not altering the original string. It's replacing what's there, not setting it. 
And again, I think that Extension should be  shortened to Ext, so the renamed 
function would be replaceExt.

defaultExtension:
This function is not a property, so it probably shouldn't be a noun, though 
I'm not quite sure what to call it. Technically, it's something like 
setWithDefaultExtensionIfNoExtension, but that's obviously a horrible name 
even if you abbreviate some of it. So, I don't really have a better name, 
though if it's not gonig to be completely renamed, I definitely think that it 
should be shortened to defaultExt rather than having the full Extension in its 
name.

joinPath:
I'm not sure that this function should be renamed to joinPath. When discussing 
other modules such as as std.ascii and std.uni, we decided that they should 
have the same function names and _not_ put the module name in their names. 
Now, this function _is_ joining paths, so it's not purely a case of having 
the module's name in the function name, but I'm still not sure that we want to 
have it as joinPath instead of join.

absolutePath:
It really shouldn't be a noun, since it's not a property. Something like 
makeAbsolute would be better.

Implementation
------------------------
You should probably consider testing the various functions with varying string 
types - both in size and mutability. Just put the test in a foreach loop like

foreach (S; TypeTuple!(string, wstring, dstring))

(but with whatever combination of string types you want to test; In the ideal 
case it would be every possible combination of character type and mutability, 
but that's not necessarily necessary). Then you'll probably end up using 
to!S() on pretty much every string. And for functions which take multiple 
strings, you can test with a combination of string types. It'll make sure that 
the functions work appropriately with the various types. The functions are 
templated, but sometimes something gets missed and they don't work with a 
particular character size or level of mutability. So, while it might be 
overkill to test _every_ combination, verifying at least some of them would be 
desirable.

Personally, I think that it's better practice to test for empty rather than 
length == 0, but I don't suppose that it really matters given that we're 
always dealing with arrays here.

Your note on baseName about it adhering to the requirements for Posix's 
basename utility should probably put in the actual documentation. The same for 
dirName.

There's no point in checking for "== null" like you do in a few places. If you 
want to check whether it's actually null, you need to check for "is null", and 
if you want to check whether it's empty, then you should just use empty. "== 
null" has a tendancy to be misleading, because people tend to think that it 
actually checks whether an array is null, which it doesn't do. Personally, I'm 
likely to think that it's a bug when I see it.

It would probably be more efficient for setExtension to use ~=. e.g.

auto retval = stripExtension(path);
retval ~= '.';
retval ~= ext;
return cast(typeof(return))(retval);

Maybe it doesn't matter (and the code _is_  a bit less clean that way), but it 
might reduce the number of necessary heap allocations. The same goes for 
pretty much anywhere else that ~ is used. It probably doesn't matter all that 
much, but if we want to get the functions as efficient as possible, then it's 
better to use ~= than ~ unless the compiler is actually smart enough to 
tranlate the ~'s into ~='s, but I doubt that it is (particularly since some 
chunk of that is in druntime rather than in the compiler itself).

There's no such thing as an OutOfMemoryException. It's an OutOfMemoryError. 
And being an error, I don't think that there's any need to mention it in the 
documentation. _Anything_ which allocates memory could cause the application 
to run out of memory, and being an Error, you're really not supposed to catch 
it, so there's no real point in mentioning it in the documentation.


Okay. I think that that covers it, though I might have missed something. Aside 
from a few nitpicks, my main issue with the module as it stands is the 
unnecessarily long function names. The function names should be as long as is 
necessary to make them clear but not longer. And putting Extension and 
Separator in names is completely unnecessary. Ext and Sep are quite clear 
enough (especially in context) and are _much_ shorter. The longer names add 
nothing and make the functions/variables much more annoying to deal with in 
longer expressions.

So overall, I think that it looks good and is a definite improvement, but I 
really think that the function names need to be shortened.

- Jonathan M Davis

Jul 15 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1680.1310785695.14074.digitalmars-d puremagic.com...
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When 
 discussing
 other modules such as as std.ascii and std.uni, we decided that they 
 should
 have the same function names and _not_ put the module name in their names.
 Now, this function _is_ joining paths, so it's not purely a case of having
 the module's name in the function name, but I'm still not sure that we 
 want to
 have it as joinPath instead of join.

I really, really, REALLY, *REALLY* think it's a big mistake to categorically 
ban using the module name in the function names. And its *especially* bad in 
this case. Having both std.algorithm.join and std.path.join with their 
different semantics is just begging for problems. "join" is a terrible, 
terrible, awful, horrible name for this function.

I don't care if using conflicting names between std.ascii and std.uni was 
deemed OK. Every case needs to be judged on its own or else we're knee-deep 
in bondage & discipline design and we may as well call ourselves "Sun 
Microsystems" and our language "Java".

And yes, the precedent set by std.ascii and std.uni is a factor in the 
decision, but I think the rest of the situation here easily overrides that 
concern. Hell, when people see "join" they think of the semantics of 
std.algortihm.join, *never* "Oh, 'join' means using path separators." And 
looking at the names of the arguments, or other context clues, isn't a valid 
counter-argument because when I see "join(path1, path2, filename)" I'm going 
to immediately think "bug, or at least overly fragile". And why not? From 
the name "join", I have no way to tell.

I don't give a crap about "Ext" vs "Extension" (they both get the job done), 
and I think that horse has already been beaten to death last time std.path 
was discussed. But calling joinPath "join", is bad, bad, bad, bad, BAD.

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 16 July 2011 01:10:02 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1680.1310785695.14074.digitalmars-d puremagic.com...
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing
 other modules such as as std.ascii and std.uni, we decided that they
 should
 have the same function names and _not_ put the module name in their
 names. Now, this function _is_ joining paths, so it's not purely a case
 of having the module's name in the function name, but I'm still not
 sure that we want to
 have it as joinPath instead of join.

 
 I really, really, REALLY, *REALLY* think it's a big mistake to categorically
 ban using the module name in the function names. And its *especially* bad
 in this case. Having both std.algorithm.join and std.path.join with their
 different semantics is just begging for problems. "join" is a terrible,
 terrible, awful, horrible name for this function.
 
 I don't care if using conflicting names between std.ascii and std.uni was
 deemed OK. Every case needs to be judged on its own or else we're knee-deep
 in bondage & discipline design and we may as well call ourselves "Sun
 Microsystems" and our language "Java".
 
 And yes, the precedent set by std.ascii and std.uni is a factor in the
 decision, but I think the rest of the situation here easily overrides that
 concern. Hell, when people see "join" they think of the semantics of
 std.algortihm.join, *never* "Oh, 'join' means using path separators." And
 looking at the names of the arguments, or other context clues, isn't a valid
 counter-argument because when I see "join(path1, path2, filename)" I'm
 going to immediately think "bug, or at least overly fragile". And why not?
 From the name "join", I have no way to tell.
 
 I don't give a crap about "Ext" vs "Extension" (they both get the job done),
 and I think that horse has already been beaten to death last time std.path
 was discussed. But calling joinPath "join", is bad, bad, bad, bad, BAD.

Well, like I said, I'm not sure that naming it joinPath instead of join is a 
good idea, but I'm not sure that it's a bad idea either. In this particular 
case, path is not only the module name, but it's actually what's being joined. 
So, I'm not entirely comfortable naming it joinPath, but I'm not entirely 
comfortable naming it join either.

I think that the general decision to not put module names in function names 
needs to be brought up in this case, since that is essentially what is 
happening here, but it may very well be that this is a case where we should 
not follow that general rule.

- Jonathan M Davis

Jul 15 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I was quite fond of it being renamed to joinPath, because when I'm
doing string processing on paths I import both std.algorithm/std.array
and std.path, and so far I've always had conflicts with join()
functions.

Jul 16 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 15 Jul 2011 20:08:03 -0700, Jonathan M Davis wrote:

 Names
 -----------
 dirSeparator:
 I'd love to see this as something much shorter like dirSep. I think that
 dirSeparator is overly long - particularly for something which is likely
 to be used primarily in longer expressions.
 
 currentDirSymbol and parentDirSymbol: The same here. These are _way_
 longer than the strings that they represent. If we have them, they
 should be much shorter (e.g. currDirSym and parDirSym). But really, I
 think that I have to concur with dsimcha on this one. The strings for
 these are completely standard, easy to remember, and even shorter names
 for them would still be longer than the actual strings by a fair margin.
 We should just toss them.
 
 isDirSeparator:
 I'd love to see this shortened to isDirSep.
 
 extension:
 I'd love to see this shortened to ext.
 
 stripExtension:
 I'd love to see this shortened to stripExt.

This has been discussed before, and even voted upon.  While the voting 
was very informal, with hardly enough participants to claim statistical 
significance, it at least showed that the current names have (perhaps 
even majority) support in the community:

http://www.digitalmars.com/d/archives/digitalmars/D/
Proposal_for_std.path_replacement_131121.html#N131332

 
 setExtensions:
 I think that it should have replace in its name rather than set, since
 it's not altering the original string. It's replacing what's there, not
 setting it. And again, I think that Extension should be  shortened to
 Ext, so the renamed function would be replaceExt.

To me, "replace" has an even stronger connotation than "set" that the 
original string is being modified.  Even so, that's actually not the 
reason I chose "set".  Rather, I think that

- replaceExtension implies that an existing extension gets replaced,
- addExtension (the current std.path uses addExt) implies that an
  extension gets added -- appended, that is -- regardless of whether
  one exists already, while
- setExtension implies that the extension is set to the given value
  whether the path already has one or not.

Maybe that's just me, but at least I'd like to get more opinions before 
changing it.


 defaultExtension:
 This function is not a property, so it probably shouldn't be a noun,
 though I'm not quite sure what to call it. Technically, it's something
 like setWithDefaultExtensionIfNoExtension, but that's obviously a
 horrible name even if you abbreviate some of it. So, I don't really have
 a better name, though if it's not gonig to be completely renamed, I
 definitely think that it should be shortened to defaultExt rather than
 having the full Extension in its name.
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing other modules such as as std.ascii and std.uni, we decided
 that they should have the same function names and _not_ put the module
 name in their names. Now, this function _is_ joining paths, so it's not
 purely a case of having the module's name in the function name, but I'm
 still not sure that we want to have it as joinPath instead of join.

Nick expressed my feelings about this naming choice pretty well.  My main 
reason for not going with join() here is that this function does 
something very different from the other join().


 absolutePath:
 It really shouldn't be a noun, since it's not a property. Something like
 makeAbsolute would be better.

I originally called it toAbsolute(), but someone complained about that, 
too.  Sigh.  When it comes to function names, it is very hard to please 
everyone.


 Implementation
 ------------------------
 You should probably consider testing the various functions with varying
 string types - both in size and mutability. Just put the test in a
 foreach loop like
 
 foreach (S; TypeTuple!(string, wstring, dstring))
 
 (but with whatever combination of string types you want to test; In the
 ideal case it would be every possible combination of character type and
 mutability, but that's not necessarily necessary). Then you'll probably
 end up using to!S() on pretty much every string. And for functions which
 take multiple strings, you can test with a combination of string types.
 It'll make sure that the functions work appropriately with the various
 types. The functions are templated, but sometimes something gets missed
 and they don't work with a particular character size or level of
 mutability. So, while it might be overkill to test _every_ combination,
 verifying at least some of them would be desirable.

Actually, I do this in most unittests, although not as systematically as 
what you suggest.  Look at the baseName unittests, for example:  
"file.ext"w is a wstring, "file.ext"d is a dstring and "file"w.dup is a 
wchar[].  The only thing I'm not testing is const(T)[].  Maybe your 
approach would be better.


 Personally, I think that it's better practice to test for empty rather
 than length == 0, but I don't suppose that it really matters given that
 we're always dealing with arrays here.

I agree.


 Your note on baseName about it adhering to the requirements for Posix's
 basename utility should probably put in the actual documentation. The
 same for dirName.

I considered doing that, but I was afraid of alienating Windows 
programmers.  It's probably a good idea.


 There's no point in checking for "== null" like you do in a few places.
 If you want to check whether it's actually null, you need to check for
 "is null", and if you want to check whether it's empty, then you should
 just use empty. "== null" has a tendancy to be misleading, because
 people tend to think that it actually checks whether an array is null,
 which it doesn't do. Personally, I'm likely to think that it's a bug
 when I see it.

Agreed.  I'll fix that.


 It would probably be more efficient for setExtension to use ~=. e.g.
 
 auto retval = stripExtension(path);
 retval ~= '.';
 retval ~= ext;
 return cast(typeof(return))(retval);

 Maybe it doesn't matter (and the code _is_  a bit less clean that way),
 but it might reduce the number of necessary heap allocations. The same
 goes for pretty much anywhere else that ~ is used. It probably doesn't
 matter all that much, but if we want to get the functions as efficient
 as possible, then it's better to use ~= than ~ unless the compiler is
 actually smart enough to tranlate the ~'s into ~='s, but I doubt that it
 is (particularly since some chunk of that is in druntime rather than in
 the compiler itself).

Yeah, David pointed this out as well.

 
 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real point
 in mentioning it in the documentation.

I'll fix the documentation. 

 
 Okay. I think that that covers it, though I might have missed something.
 Aside from a few nitpicks, my main issue with the module as it stands is
 the unnecessarily long function names. The function names should be as
 long as is necessary to make them clear but not longer. And putting
 Extension and Separator in names is completely unnecessary. Ext and Sep
 are quite clear enough (especially in context) and are _much_ shorter.
 The longer names add nothing and make the functions/variables much more
 annoying to deal with in longer expressions.
 
 So overall, I think that it looks good and is a definite improvement,
 but I really think that the function names need to be shortened.

Thanks for a thorough review! :)

-Lars

Jul 16 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 16 July 2011 19:39:13 Lars T. Kyllingstad wrote:
 On Fri, 15 Jul 2011 20:08:03 -0700, Jonathan M Davis wrote:
 Names
 -----------
 dirSeparator:
 I'd love to see this as something much shorter like dirSep. I think that
 dirSeparator is overly long - particularly for something which is likely
 to be used primarily in longer expressions.
 
 currentDirSymbol and parentDirSymbol: The same here. These are _way_
 longer than the strings that they represent. If we have them, they
 should be much shorter (e.g. currDirSym and parDirSym). But really, I
 think that I have to concur with dsimcha on this one. The strings for
 these are completely standard, easy to remember, and even shorter names
 for them would still be longer than the actual strings by a fair margin.
 We should just toss them.
 
 isDirSeparator:
 I'd love to see this shortened to isDirSep.
 
 extension:
 I'd love to see this shortened to ext.
 
 stripExtension:
 I'd love to see this shortened to stripExt.

 
 This has been discussed before, and even voted upon.  While the voting
 was very informal, with hardly enough participants to claim statistical
 significance, it at least showed that the current names have (perhaps
 even majority) support in the community:
 
 http://www.digitalmars.com/d/archives/digitalmars/D/
 Proposal_for_std.path_replacement_131121.html#N131332

Well, looking at that, it looks like it was pretty much a toss up on most of 
those, with them being one vote off from one another. There just isn't a large 
enough sample there to say much in most cases. It could easily go either way 
if a decent number of people voted on it. Personally, I'm definitely against 
putting Extension and Separator in the names because it makes them much longer 
without really adding any information.

 setExtensions:
 I think that it should have replace in its name rather than set, since
 it's not altering the original string. It's replacing what's there, not
 setting it. And again, I think that Extension should be  shortened to
 Ext, so the renamed function would be replaceExt.

 
 To me, "replace" has an even stronger connotation than "set" that the
 original string is being modified.  Even so, that's actually not the
 reason I chose "set".  Rather, I think that
 
 - replaceExtension implies that an existing extension gets replaced,
 - addExtension (the current std.path uses addExt) implies that an
   extension gets added -- appended, that is -- regardless of whether
   one exists already, while
 - setExtension implies that the extension is set to the given value
   whether the path already has one or not.
 
 Maybe that's just me, but at least I'd like to get more opinions before
 changing it.

Well, it's certainly debatable, but I don't like the idea of having set in a 
function which isn't altering its arguments. It is a valid point about replace 
impliing that an extension exists, but you could also argue that every file has 
an extension but that some have an empty extension, so replace is replacing 
the empty one - though as it stands if you try and do the reverse and give 
setExtension an empty extension, you end up with "." on the end instead of 
nothing.

So, anyway, it's certainly debatable, but personally, I think that set 
shouldn't be used on functions which don't alter at least one of their 
arguments.

 defaultExtension:
 This function is not a property, so it probably shouldn't be a noun,
 though I'm not quite sure what to call it. Technically, it's something
 like setWithDefaultExtensionIfNoExtension, but that's obviously a
 horrible name even if you abbreviate some of it. So, I don't really have
 a better name, though if it's not gonig to be completely renamed, I
 definitely think that it should be shortened to defaultExt rather than
 having the full Extension in its name.
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing other modules such as as std.ascii and std.uni, we decided
 that they should have the same function names and _not_ put the module
 name in their names. Now, this function _is_ joining paths, so it's not
 purely a case of having the module's name in the function name, but I'm
 still not sure that we want to have it as joinPath instead of join.

 
 Nick expressed my feelings about this naming choice pretty well.  My main
 reason for not going with join() here is that this function does
 something very different from the other join().

Actually, it really isn't all that much different from std.array.join. It's 
just that it's always using a specific separator, and unlike std.array.join, it 
doesn't add the separator if it's already there. So, it wouldn't be at all 
amiss to call it just join. But I can understand not wanting the two to have 
the same name. As I said, I'm a bit divided on it. In general, we shouldn't be 
putting the module's naming in a function's name, but at the same time, 
joinPath _is_ joining paths rather than path just being the module name. I 
brought it up because it does fall in the camp of functions with the same name 
with essentially the same functionality (albeit tweaked for their particular 
module and use case), and it was decided that in the general case, such 
functions should have the same name and let the module system be used to 
distinguish them. But while that's what we should be doing in the general 
case, that doesn't necessarily mean that we _have_ to do it in every case. The 
pros and cons must be weighed with a major reason to use the same naming being 
that that's the general convention. But if there's enough reason to use a 
different name, then we can use a different name. I don't feel strongly about
it 
in either direction in this particular case. I think that it could go either 
way. I just felt that it needed to be brought up.

 absolutePath:
 It really shouldn't be a noun, since it's not a property. Something like
 makeAbsolute would be better.

 
 I originally called it toAbsolute(), but someone complained about that,
 too.  Sigh.  When it comes to function names, it is very hard to please
 everyone.

Well, I'm not terribly hung up on what the exact name is, but in general, 
variables, properties, and types should be nouns whereas functions should be 
verbs. absolutePath is a normal function, but its name is a noun. And as such, 
I think that makeAbsolute or toAbsolute makes a lot more sense. If other 
people think that path should be in the name for clarity or whatnot, then it 
could be change to toAbsPath or something similar.

 Implementation
 ------------------------
 You should probably consider testing the various functions with varying
 string types - both in size and mutability. Just put the test in a
 foreach loop like
 
 foreach (S; TypeTuple!(string, wstring, dstring))
 
 (but with whatever combination of string types you want to test; In the
 ideal case it would be every possible combination of character type and
 mutability, but that's not necessarily necessary). Then you'll probably
 end up using to!S() on pretty much every string. And for functions which
 take multiple strings, you can test with a combination of string types.
 It'll make sure that the functions work appropriately with the various
 types. The functions are templated, but sometimes something gets missed
 and they don't work with a particular character size or level of
 mutability. So, while it might be overkill to test _every_ combination,
 verifying at least some of them would be desirable.

 
 Actually, I do this in most unittests, although not as systematically as
 what you suggest.  Look at the baseName unittests, for example:
 "file.ext"w is a wstring, "file.ext"d is a dstring and "file"w.dup is a
 wchar[].  The only thing I'm not testing is const(T)[].  Maybe your
 approach would be better.
 
 Personally, I think that it's better practice to test for empty rather
 than length == 0, but I don't suppose that it really matters given that
 we're always dealing with arrays here.

 
 I agree.
 
 Your note on baseName about it adhering to the requirements for Posix's
 basename utility should probably put in the actual documentation. The
 same for dirName.

 
 I considered doing that, but I was afraid of alienating Windows
 programmers.  It's probably a good idea.

It makes it clearer what it's doing, and unless Windows has an equivalent 
function, then it's not like you're choosing the functionality of one over the 
other. And from the little poking around I just did, it seems that Windows 
does _not_ have an equivalent function (apparently you have to use something 
like _splitpath). So, I think that it's just better to put in tho docs. It 
makes it clearer, and if the link has good info, then it's that much better.

 There's no point in checking for "== null" like you do in a few places.
 If you want to check whether it's actually null, you need to check for
 "is null", and if you want to check whether it's empty, then you should
 just use empty. "== null" has a tendancy to be misleading, because
 people tend to think that it actually checks whether an array is null,
 which it doesn't do. Personally, I'm likely to think that it's a bug
 when I see it.

 
 Agreed.  I'll fix that.
 
 It would probably be more efficient for setExtension to use ~=. e.g.
 
 auto retval = stripExtension(path);
 retval ~= '.';
 retval ~= ext;
 return cast(typeof(return))(retval);
 
 Maybe it doesn't matter (and the code _is_  a bit less clean that way),
 but it might reduce the number of necessary heap allocations. The same
 goes for pretty much anywhere else that ~ is used. It probably doesn't
 matter all that much, but if we want to get the functions as efficient
 as possible, then it's better to use ~= than ~ unless the compiler is
 actually smart enough to tranlate the ~'s into ~='s, but I doubt that it
 is (particularly since some chunk of that is in druntime rather than in
 the compiler itself).

 
 Yeah, David pointed this out as well.
 
 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real point
 in mentioning it in the documentation.

 
 I'll fix the documentation.

Actually, core.memory needs to be fixed on that count (that's probably where 
you got the idea about OutOfMemoryException). I should probably create a pull 
request to fix that.

- Jonathan M Davis

Jul 16 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sat, 16 Jul 2011 14:25:36 -0700, Jonathan M Davis wrote:

 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real
 point in mentioning it in the documentation.

 
 I'll fix the documentation.

 
 Actually, core.memory needs to be fixed on that count (that's probably
 where you got the idea about OutOfMemoryException). I should probably
 create a pull request to fix that.

That function (expandTilde) and its documentation is from the current 
std.path, dating back to D1.  I haven't made any significant changes to 
it.  What it does is call druntime's onOutOfMemoryError() handler, which 
I guess throws an OutOfMemoryException in D1(?)

-Lars

Jul 17 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I'd rather have `extension` renamed to getExt. `ext` is used way too
often in user-code.

Jul 15 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 16 July 2011 05:34:06 Andrej Mitrovic wrote:
 I'd rather have `extension` renamed to getExt. `ext` is used way too
 often in user-code.

extension is a bit iffy, since it's just the one word. So, I'm not as concerned 
about that one (and I'd rather have it be a property than a have it be the 
normal function getExt). But all of the others where Extension is just part of 
a longer name just don't need to be that long. The combination of the name and 
its usage should make it as clear with just Ext as it is with Extension, and 
the shorter names are much more pleasant to deal with - especially since (in 
my experience at least) the various std.path functions get used together in 
larger expressions such that the longer names are definitely an issue.

So, while I'm not altogether fond of extension, it's fine with me if it stays 
as extension. But in the cases where Extension is just part of a longer name, 
I'd _really_ like to see them shortened to Ext.

- Jonathan M Davis

Jul 15 2011

torhu <no spam.invalid> writes:

On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

Looks nice and clean, both docs and code!  I like modules that are not 
overengineered :)

I noticed a couple of things that I don't think have been mentioned already:

The docs for defaultExtension say that there's one case where doesn't 
allocate.  But the code says that it always does (.idup isn't 
conditional, it's just .dup with a different return type).

joinPath also uses .idup, thus always allocates.  Maybe there could be a 
documented allocation policy for the module as a whole?  Copy-on-write 
used to be the general rule for Phobos, don't know if that's true 
anymore.  For most of the functions in this module, even a single heap 
allocation could be more expensive than the rest of what they do.

I've read the discussions about the function names, but I still dare to 
make a suggestion of my own.  I feel like joinPath is a bit iffy, 
shouldn't it be plural somehow?  What it does is to create a path by 
putting pieces together, some paths in their own right, some not.  My 
suggestion is to simply call it buildPath.

Jul 17 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/


std_path.html

 Looks nice and clean, both docs and code!  I like modules that are not
 overengineered :)

Thanks!

 
 I noticed a couple of things that I don't think have been mentioned
 already:
 
 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

There is no reason for idup to duplicate an array which is already 
immutable.  If it does, I'd say it is a compiler bug.
 

 joinPath also uses .idup, thus always allocates.  Maybe there could be a
 documented allocation policy for the module as a whole?  Copy-on-write
 used to be the general rule for Phobos, don't know if that's true
 anymore.  For most of the functions in this module, even a single heap
 allocation could be more expensive than the rest of what they do.

True.  I've tried keeping allocations to a minimum.  If you see other 
places where an allocation could be avoided (besides the idups), please 
let me know.

I'm not sure what you mean by copy-on-write in this case.  The functions 
in this module never modify the input arrays.


 I've read the discussions about the function names, but I still dare to
 make a suggestion of my own.  I feel like joinPath is a bit iffy,
 shouldn't it be plural somehow?  What it does is to create a path by
 putting pieces together, some paths in their own right, some not.  My
 suggestion is to simply call it buildPath.

Good point.  joinPaths would be another option.

-Lars

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Jul 2011 09:16:35 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/


 std_path.html

 Looks nice and clean, both docs and code!  I like modules that are not
 overengineered :)

 Thanks!


 I noticed a couple of things that I don't think have been mentioned
 already:

 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

 There is no reason for idup to duplicate an array which is already
 immutable.  If it does, I'd say it is a compiler bug.

No, it has to duplicate.  Part of idup is saying "I want an immutable  
return" and the other part is saying "I want a duplicate"

For example:

string x = "hello".idup;
string y = x;

assert(x.capacity > 0); // would fail if idup just returned the same thing
x ~= " world";

assert(y is x); // would fail.

The compiler cannot know whether you care or not.

If you don't care, why are you calling idup on an immutable array?  Just  
use it.

-Steve

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 10:04:26 -0400, Steven Schveighoffer wrote:

 On Mon, 18 Jul 2011 09:16:35 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at
 the ends two weeks from now on July 28.  This will be followed by a
 week of voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/


 std_path.html

 Looks nice and clean, both docs and code!  I like modules that are not
 overengineered :)

 Thanks!


 I noticed a couple of things that I don't think have been mentioned
 already:

 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

 There is no reason for idup to duplicate an array which is already
 immutable.  If it does, I'd say it is a compiler bug.

 
 No, it has to duplicate.  Part of idup is saying "I want an immutable
 return" and the other part is saying "I want a duplicate"
 
 For example:
 
 string x = "hello".idup;
 string y = x;
 
 assert(x.capacity > 0); // would fail if idup just returned the same
 thing x ~= " world";
 
 assert(y is x); // would fail.
 
 The compiler cannot know whether you care or not.

Ah, ok.


 If you don't care, why are you calling idup on an immutable array?  Just
 use it.

The specific cases in question are templated functions, where I don't 
know whether the array is immutable unless I specifically test it.  I 
simply used arr.idup, assuming the compiler would optimise away the 
duplication when arr is immutable.  But it's no big deal to change it to

   static if (!is(T == immutable)) return arr.idup;

-Lars

Jul 18 2011

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:j01h2f$2ia$2 digitalmars.com...
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it.  I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable.  But it's no big deal to change it to

   static if (!is(T == immutable)) return arr.idup;

 -Lars

Doesn't std.conv.to do this for you?

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Tue, 19 Jul 2011 00:57:44 +1000, Daniel Murphy wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:j01h2f$2ia$2 digitalmars.com...
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it.  I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable.  But it's no big deal to change it
 to

   static if (!is(T == immutable)) return arr.idup;

 -Lars

 
 Doesn't std.conv.to do this for you?

It probably does!  I hadn't thought about that, but I'll check it out 
now.  Thanks!

-Lars

Jul 18 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On 2011-07-18 08:08, Lars T. Kyllingstad wrote:
 On Tue, 19 Jul 2011 00:57:44 +1000, Daniel Murphy wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:j01h2f$2ia$2 digitalmars.com...
 
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it. I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable. But it's no big deal to change it
 to
 
 static if (!is(T == immutable)) return arr.idup;
 
 -Lars

 
 Doesn't std.conv.to do this for you?

 
 It probably does! I hadn't thought about that, but I'll check it out
 now. Thanks!

It does. That's why you should generally use std.conv.to instead of dup or 
idup directly. Really, dup and idup should only be used if you definitely 
_want_ a duplicate. Otherwise, std.conv.to will just (i)dup it when you 
actually need to.

- Jonathan M Davis

Jul 18 2011

torhu <no spam.invalid> writes:

On 18.07.2011 15:16, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

...
  joinPath also uses .idup, thus always allocates.  Maybe there could be a
  documented allocation policy for the module as a whole?  Copy-on-write
  used to be the general rule for Phobos, don't know if that's true
  anymore.  For most of the functions in this module, even a single heap
  allocation could be more expensive than the rest of what they do.

 True.  I've tried keeping allocations to a minimum.  If you see other
 places where an allocation could be avoided (besides the idups), please
 let me know.

 I'm not sure what you mean by copy-on-write in this case.  The functions
 in this module never modify the input arrays.

Sorry, don't know why I thought of COW in this case.  Don't heap 
allocate if you don't need to, was what I meant.  Which was because 
.idup was used, so nevermind that now :)

  I've read the discussions about the function names, but I still dare to
  make a suggestion of my own.  I feel like joinPath is a bit iffy,
  shouldn't it be plural somehow?  What it does is to create a path by
  putting pieces together, some paths in their own right, some not.  My
  suggestion is to simply call it buildPath.

 Good point.  joinPaths would be another option.

I didn't suggest joinPaths, because the inputs are not always paths, 
generally you join things like a path to a directory together with just 
a filename.  Maybe it's ok to call it all just 'paths', I don't know. 
But the output is commonly a path, hence buildPath.  Would be 
interesting to know what other people think, though.

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 20:47:20 +0200, torhu wrote:

 I didn't suggest joinPaths, because the inputs are not always paths,
 generally you join things like a path to a directory together with just
 a filename.  Maybe it's ok to call it all just 'paths', I don't know.
 But the output is commonly a path, hence buildPath.  Would be
 interesting to know what other people think, though.

See my answer to Steve.  I'm considering expanding joinPath's charter a 
bit, allowing it to take an input range of path segments and perform 
normalization while concatenating them.  buildPath would be a good name 
for this.

-Lars

Jul 20 2011

Ellery Newcomer <ellery-newcomer utulsa.edu> writes:

Looks nice.

a thought:
alternate data streams are fringe and probably don't warrant support (you might
look into them), but windows \\?\ paths probably should be supported in eg
joinPath and pathSplitter

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 16:48:29 +0000, Ellery Newcomer wrote:

 Looks nice.
 
 a thought:
 alternate data streams are fringe and probably don't warrant support
 (you might look into them), but windows \\?\ paths probably should be
 supported in eg joinPath and pathSplitter

I'll look into it. :)

-Lars

Jul 20 2011

D Programming

C/C++ Programming

Other

digitalmars.D - Next in Review Queue: The New std.path