www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Next in Review Queue: The New std.path

reply dsimcha <dsimcha yahoo.com> writes:
Lars Kyllingstad's new and improved std.path module for Phobos is the 
next item up in the review queue.  I've volunteered to be the review 
manager.  Barring any objections, the review starts now and ends at the 
ends two weeks from now on July 28.  This will be followed by a week of 
voting, ending August 4th.

Code:
https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

Docs:
http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

Author's comments -- please read before reviewing:

First of all, note that I have rewritten almost everything from scratch.
Mainly, this is because there are a lot of corner cases which aren't
handled correctly by the current std.path, and in the end rewriting
turned out to be easier than updating.  (Check out the unittests; they
cover all the cases I could think of.)  The only functions I have kept
unchanged from the current std.path are pathCharMatch (formerly
fncharmatch), glob (formerly fnmatch) and expandTilde.

Secondly, I hope you will find the new function names to be both more
descriptive, more consistent and more in line with current Phobos
conventions.  They are the result of a (rather informal) round of
discussion and voting in this forum a few months ago (which I hope you
will consider before suggesting new names):

http://www.digitalmars.com/d/archives/digitalmars/D/Proposal_for_std.path_replacement_131121.html

Thirdly, I have applied  safe, pure and nothrow wherever possible.
There were cases where this was not possible, either due to bugs in DMD
(specifically, 5304, 5700 and 5798) or due to functions in other Phobos
modules not being properly marked with these attributes.  I have marked
these with //TODO comments and will fix them as soon as possible.

And finally, due to the deprecated: block at the bottom of the module,
this version should be backwards compatible with the current std.path
for a while yet.
Jul 14 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
...And now I'll do the honors and be the first to review.  Overall this 
looks solid and well documented but I have a few nitpicks and questions

1.  Do we really need currentDirSymbol and parentDirSymbol?  AFAIK 
they're standard across all operating systems that anyone cares about 
and everyone can easily remember them.  I have never used the 
equivalents in the old std.path and have never missed them.

2.  There's really no need to always allocate a new array on things like 
setExtension.  You could just append the extension in the case where the 
extension doesn't already exist and the string is immutable.  I'm not 
sure this is worth the extra complexity, though, given that setExtension 
will probably never be used in perf-critical code.

3.  All the stuff from the old std.path should be "scheduled for 
deprecation".  Deprecating stuff breaks people's build processes and 
should only be done after adequate warning.  I don't know a good way to 
implement this for enums and aliases, though, which I guess begs the 
question of whether DMD should support soft deprecation.

4.  This module might be useful at compile time.  As pointed out on this 
forum, Phobos should eventually include tests for CTFE-ability.  It 
would be nice if std.path had a few CTFE unit tests sprinkled in.

On 7/14/2011 8:20 PM, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue. I've volunteered to be the review
 manager. Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28. This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

 Author's comments -- please read before reviewing:

 First of all, note that I have rewritten almost everything from scratch.
 Mainly, this is because there are a lot of corner cases which aren't
 handled correctly by the current std.path, and in the end rewriting
 turned out to be easier than updating. (Check out the unittests; they
 cover all the cases I could think of.) The only functions I have kept
 unchanged from the current std.path are pathCharMatch (formerly
 fncharmatch), glob (formerly fnmatch) and expandTilde.

 Secondly, I hope you will find the new function names to be both more
 descriptive, more consistent and more in line with current Phobos
 conventions. They are the result of a (rather informal) round of
 discussion and voting in this forum a few months ago (which I hope you
 will consider before suggesting new names):

 http://www.digitalmars.com/d/archives/digitalmars/D/Proposal_for_std.path_replacement_131121.html


 Thirdly, I have applied  safe, pure and nothrow wherever possible.
 There were cases where this was not possible, either due to bugs in DMD
 (specifically, 5304, 5700 and 5798) or due to functions in other Phobos
 modules not being properly marked with these attributes. I have marked
 these with //TODO comments and will fix them as soon as possible.

 And finally, due to the deprecated: block at the bottom of the module,
 this version should be backwards compatible with the current std.path
 for a while yet.

Jul 14 2011
next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-07-14 17:33, dsimcha wrote:
 3. All the stuff from the old std.path should be "scheduled for
 deprecation". Deprecating stuff breaks people's build processes and
 should only be done after adequate warning. I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

A documentation note for it can and should be added to each function saying that it's scheduled for deprecation (std.ctype, std.string, and std.file all have some good examples of this). Unfortunately, the only way at this point to print a message about a function being scheduled for deprecation is to use a pragma inside of the function - which only works with templated functions. However, there's been a number of complaints on the Announce list about the messages which are currently printed for functions which are scheduled for deprecation, so I don't know how we want to handle such messages. At minimum, the old functions should be in the documentation and labeled as scheduled for deprecation rather than deprecated. - Jonathan M Davis
Jul 14 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"dsimcha" <dsimcha yahoo.com> wrote in message 
news:ivo271$1fu0$1 digitalmars.com...
 3.  All the stuff from the old std.path should be "scheduled for 
 deprecation".  Deprecating stuff breaks people's build processes and 
 should only be done after adequate warning.  I don't know a good way to 
 implement this for enums and aliases, though, which I guess begs the 
 question of whether DMD should support soft deprecation.

Isn't that what -d is for?
 1.  Do we really need currentDirSymbol and parentDirSymbol?  AFAIK they're 
 standard across all operating systems that anyone cares about and everyone 
 can easily remember them.  I have never used the equivalents in the old 
 std.path and have never missed them.

 4.  This module might be useful at compile time.  As pointed out on this 
 forum, Phobos should eventually include tests for CTFE-ability.  It would 
 be nice if std.path had a few CTFE unit tests sprinkled in.

Seconded, on both #1 and #4.
Jul 15 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 15 July 2011 04:13:35 Nick Sabalausky wrote:
 "dsimcha" <dsimcha yahoo.com> wrote in message
 news:ivo271$1fu0$1 digitalmars.com...
 
 3.  All the stuff from the old std.path should be "scheduled for
 deprecation".  Deprecating stuff breaks people's build processes and
 should only be done after adequate warning.  I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

Isn't that what -d is for?

The way that deprecation is supposed to work is in three stages: 1. Scheduled for deprecation. 2. Deprecated. 3. Removed. In stage 1, users are notified that a symbol is going to be deprecated, but it is not yet deprecated. This gives them time to alter their code accordingly before they will have to build with -d. In stage 2, the symbol is actually deprecated with the deprecated keyword, and -d is required to compile code which uses that symbol. If the user has not yet altered their code, then they will have to alter their build scripts to use -d. Finally, in stage 3, the deprecated symbol is removed. In stage 1, when a symbol is scheduled for deprecation, the documentation is altered to reflect that, and ideally a message would be printed out if it's used so that programmers will be better notified about the impending deprecation. However, since the only tool we have for that at the moment is pragmas, only entire modules and templated functions or types or types can have such messages (since otherwise the message would print regardless of whether you use the symbol). A number of complaints have arisen about these messages, however (a number were introduced in 2.054 - both for stuff which had already scheduled for deprecation and for stuff which was just scheduled for deprecation). So, we may end up ditching the pragma messages. I don't know. In either case, Walter has been pretty adamant about not breaking users' code without notice. So, immediately deprecating something (thereby breaking their code - even if all it takes is changing their build scripts to use -d) is not generally acceptable - hence the "scheduled for deprecation" phase. The current plan is that stage 1 and stage 2 both take 6 months each, but we may adjust that depending on what's being changed (e.g.g smaller changes or changes of recently added stuff may take less time). - Jonathan M Davis
Jul 15 2011
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 14 Jul 2011 20:33:20 -0400, dsimcha wrote:

 ...And now I'll do the honors and be the first to review.  Overall this
 looks solid and well documented but I have a few nitpicks and questions
 
 1.  Do we really need currentDirSymbol and parentDirSymbol?  AFAIK
 they're standard across all operating systems that anyone cares about
 and everyone can easily remember them.  I have never used the
 equivalents in the old std.path and have never missed them.

I agree. (I didn't really want to keep them, but I wasn't sure whether anyone used them.) If I don't hear any objections, I will remove currentDirSymbol and parentDirSymbol.
 2.  There's really no need to always allocate a new array on things like
 setExtension.  You could just append the extension in the case where the
 extension doesn't already exist and the string is immutable.  I'm not
 sure this is worth the extra complexity, though, given that setExtension
 will probably never be used in perf-critical code.

I guess it depends on what is the most common case. If "immutable without extension" is the most common case then it could be worth it. Otherwise it just adds code bloat for little gain.
 3.  All the stuff from the old std.path should be "scheduled for
 deprecation".  Deprecating stuff breaks people's build processes and
 should only be done after adequate warning.  I don't know a good way to
 implement this for enums and aliases, though, which I guess begs the
 question of whether DMD should support soft deprecation.

True. As Jonathan points out, there is ongoing discussion about how best to do deprecation. As has also been pointed out elsewhere, the old functions should be kept as they are, so behaviour doesn't silently change. I'll add them back in, and include "scheduled for deprecation" pragma(msg)s where possible.
 4.  This module might be useful at compile time.  As pointed out on this
 forum, Phobos should eventually include tests for CTFE-ability.  It
 would be nice if std.path had a few CTFE unit tests sprinkled in.

I'm not sure I agree with this. As I understand it, most of D should eventually be usable in CTFE. (Don has made enormous progress in this regard.) In most cases we therefore shouldn't have to change Phobos to make it CTFEable, it should Just Work. On that note, it shouldn't be necessary to add CTFE-specific unittests either. -Lars
Jul 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpk4e$gf5$1 digitalmars.com...
 4.  This module might be useful at compile time.  As pointed out on this
 forum, Phobos should eventually include tests for CTFE-ability.  It
 would be nice if std.path had a few CTFE unit tests sprinkled in.

I'm not sure I agree with this. As I understand it, most of D should eventually be usable in CTFE. (Don has made enormous progress in this regard.) In most cases we therefore shouldn't have to change Phobos to make it CTFEable, it should Just Work. On that note, it shouldn't be necessary to add CTFE-specific unittests either.

I totally understand that reasoning, but as I pointed out in the OP of the big CTFE-ability thread: We're currently in a situation where basic commonly-needed-in-CTFE functions are frequently breaking at compile-time, which forces D users to avoid using Phobos at compile time and reimplement any needed Phobos functions themselves. It would be better to just 1. Have unittests to detect breakages of at least the functions that are likely to be needed at compile-time, and 2. Insert quick-n-dirty CTFE-able alternate codepaths (via "if(_ctfe)") around any part of the affected functions that's broken at compile time. Plus, the unittests can then stay in to help detect CTFE regressions. In fact, one of the CTFE bugs I reported one time was a result of a CTFE-ability unittest I had for one of my functions. (Although, it was a function that was specifically written to be used at compile time...So it was really just a unittest to make sure it worked as intended...)
Jul 15 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 15:57:58 -0400, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpk4e$gf5$1 digitalmars.com...
 4.  This module might be useful at compile time.  As pointed out on
 this forum, Phobos should eventually include tests for CTFE-ability. 
 It would be nice if std.path had a few CTFE unit tests sprinkled in.

I'm not sure I agree with this. As I understand it, most of D should eventually be usable in CTFE. (Don has made enormous progress in this regard.) In most cases we therefore shouldn't have to change Phobos to make it CTFEable, it should Just Work. On that note, it shouldn't be necessary to add CTFE-specific unittests either.

the big CTFE-ability thread: We're currently in a situation where basic commonly-needed-in-CTFE functions are frequently breaking at compile-time, which forces D users to avoid using Phobos at compile time and reimplement any needed Phobos functions themselves. It would be better to just 1. Have unittests to detect breakages of at least the functions that are likely to be needed at compile-time, and 2. Insert quick-n-dirty CTFE-able alternate codepaths (via "if(_ctfe)") around any part of the affected functions that's broken at compile time. Plus, the unittests can then stay in to help detect CTFE regressions. In fact, one of the CTFE bugs I reported one time was a result of a CTFE-ability unittest I had for one of my functions. (Although, it was a function that was specifically written to be used at compile time...So it was really just a unittest to make sure it worked as intended...)

I didn't notice your CTFE-ability thread until now, because I was away on vacation when you posted it. I'll make sure to read it, maybe it will convince me. ;) -Lars
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Lars, is normalize basically that feature request (toNativePath) I
asked for? It does seem to do more than that though.
Jul 14 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
All functions have nice names except fcmp, why the abbreviation?

Is expandTilde supposed to work on Windows? I've never seen tilde used there.
Jul 14 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.1654.1310690711.14074.digitalmars-d puremagic.com...
 All functions have nice names except fcmp, why the abbreviation?

 Is expandTilde supposed to work on Windows? I've never seen tilde used 
 there.

The docs say that it does nothing on Windows: "For Windows, expandTilde() merely returns its argument inputPath." This is probably for the best because the Windows equivilent to "/home/user" can be one of a number of different locations depending on what exactly you're storing. If expandTilde did work on Windows, then no matter what it expanded to, it would be wrong at least half the time.
Jul 15 2011
parent Mafi <mafi example.org> writes:
Am 15.07.2011 10:22, schrieb Nick Sabalausky:
 "Andrej Mitrovic"<andrej.mitrovich gmail.com>  wrote in message
 news:mailman.1654.1310690711.14074.digitalmars-d puremagic.com...
 All functions have nice names except fcmp, why the abbreviation?

 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

The docs say that it does nothing on Windows: "For Windows, expandTilde() merely returns its argument inputPath." This is probably for the best because the Windows equivilent to "/home/user" can be one of a number of different locations depending on what exactly you're storing. If expandTilde did work on Windows, then no matter what it expanded to, it would be wrong at least half the time.

What about longname becoming longna~1 on dos/emulators etc. Expanding this back needs information about the actual state of the file system so I don't now if such a function belongs to std.path but a longname.png => longna~1.png belongs IMO to std.path .
Jul 15 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Steven Schveighoffer" <schveiguy yahoo.com> wrote in message 
news:op.vyoeirtveav7ka localhost.localdomain...
 On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 Actually, if Windows did have one single equivalent to "/home/{user}", 
 then
 I think it would be absolutely fantastic to make tilde mean "home 
 directory"
 on Windows. But as things are, that's a moot point, of course.

The typical thing is to use the environment variable USERPROFILE, but that's an odd dependency for a string processing library. Windows also doesn't keep the same notion of home directory as Unix does. You can read more about it here: http://en.wikipedia.org/wiki/Environment_variable If that's the only function that uses environment variables, and you can't have the function without it, it might be worth having it. However, we must think about how that affects things like purity.

No, like I've said, all the possible choices are frequently wrong, and that includes USERPROFILE. If you're selecting a default directory for the user to save/load a word processing document, then USERPROFILE is wrong (should be the user's My Documents - or the last used directory, whatever). If you're storing per-user application-settings/internal-application-data, then USERPROFILE is wrong. It should be APPDATA if it's roaming data (should remain with the user on different computers). Or if it's machine-specific (non-roaming) data, then it belongs in whatever the env variable for "%APPDATA%\Local Settings\Application Data" is. So expanding tilde to USERPROFILE will just encourage people to do the wrong thing.
Jul 15 2011
next sibling parent David Nadlinger <see klickverbot.at> writes:
On 7/18/11 3:55 PM, Steven Schveighoffer wrote:
 In other words, expand more than just the tilde.

 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it
 more than having to search the web for a right answer to something that
 doesn't have one.

Just a quick idea: Maybe we could provide a set of standard directories (e.g. UserData.config, UserData.cache, …) along with a function which maps them to the the operating system defaults, i.e. the well-known directories on Windows, the $XDG_*_HOME ones on Linux, etc. David
Jul 18 2011
prev sibling parent torhu <no spam.invalid> writes:
On 18.07.2011 15:55, Steven Schveighoffer wrote:
...
 Certainly not having that function on Windows will encourage them to
 choose a wrong choice then, no?  I mean if they want to find a "home
 directory" and phobos doesn't support that, then they google online, find
 something like %userprofile%, use it, and maybe file a bug against phobos
 for good measure :)

 So there is no right answer, we should at least come up with *an* answer.
 In the Windows world of de-facto standards, we're bound to start a trend ;)

 What about this?

 ~/Application Data =>  %APPDATA%
 ~/Local Settings/Application Data =>  %LOCALAPPDATA% or %USERPROFILE%\Local
 Settings\Application Data
 etc.
 Last resort:
 ~ =>  %USERPROFILE%

 In other words, expand more than just the tilde.

Wouldn't that tend to cause the same issue? They use tilde on Linux, then get %userprofile% because they didn't change it when they port to Windows. If people know that they want %appdata%, they could just use that.
 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it more
 than having to search the web for a right answer to something that doesn't
 have one.

That could be done, here's an example I'm familiar with: http://www.allegro.cc/manual/5/al_get_standard_path This is designed to work across all the major platforms. Some paths will be the same on some platforms. ALLEGRO_USER_DOCUMENTS_PATH, ALLEGRO_USER_DATA_PATH, and ALLEGRO_USER_SETTINGS_PATH are probably the most relevant ones for this discussion.
Jul 18 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Just few comments to the source code. I have not used it yet.

version (Windows) private alias Signed!size_t ssize_t;

Why not just ptrdiff_t ?

-------------------

In theory this too is (in C[] path) but I presume you have found troubles:
private C[] chompDirSeparators(C)(C[] path)   safe pure nothrow

-------------------

So here you have had to use Unqual
immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[] ext)

Instead of Unqual isn't it nicer to use a Deconst!() template?

-------------------

// Detects whether the given types are all string types of the same width
private template Strings...)  if (Strings.length > 0)

I think that a CTFE function with a static foreach is better/faster/simpler
than the recursive compatibleStrings() template.

-------------------

unittest
{

I suggest to add a comment that tells what function/class/struct is tested in
this unittest (this is a poor's man named unittest).

Bye,
bearophile
Jul 14 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 14 Jul 2011 21:00:26 -0400, bearophile wrote:

 Just few comments to the source code. I have not used it yet.
 
 version (Windows) private alias Signed!size_t ssize_t;
 
 Why not just ptrdiff_t ?

I find ssize_t (which is standard on POSIX) to be much more descriptive of its purpose. http://en.wikipedia.org/wiki/Size_t
 In theory this too is (in C[] path) but I presume you have found
 troubles: private C[] chompDirSeparators(C)(C[] path)   safe pure
 nothrow

The problem is that path will then be const(C[]), and it becomes impossible to return a slice of it as C[]. The correct solution is to use inout, and I intend to do so once inout is correctly implemented in DMD.
 So here you have had to use Unqual
 immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
 immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[]
 ext)
 
 Instead of Unqual isn't it nicer to use a Deconst!() template?

Hmm, I guess you're right. "shared" shouldn't be stripped, for instance.
 // Detects whether the given types are all string types of the same
 width private template Strings...)  if (Strings.length > 0)
 
 I think that a CTFE function with a static foreach is
 better/faster/simpler than the recursive compatibleStrings() template.

Maybe, but I also find it uglier in use due to the extra parentheses. if (compatibleStrings!(T, U, V)()) Personally, I don't think it's worth changing it.
 unittest
 {
 
 I suggest to add a comment that tells what function/class/struct is
 tested in this unittest (this is a poor's man named unittest).

I don't think that is necessary. Looking at the unittests in this module, I think it is rather obvious what is being tested -- not to mention that the unittests always follow directly after the function being tested. -Lars
Jul 15 2011
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
  So here you have had to use Unqual
  immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext)
  immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[] path, in C2[]
  ext)

  Instead of Unqual isn't it nicer to use a Deconst!() template?



Given that immutable( const(char) ) == immutable(char), I think the Unqual! should simply be removed.
Jul 15 2011
parent reply KennyTM~ <kennytm gmail.com> writes:
On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
   So here you have had to use Unqual
   immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in
   C2[] ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in
   C1[] path, in C2[] ext)

   Instead of Unqual isn't it nicer to use a Deconst!() template?


Hmm, I guess you're right. "shared" shouldn't be stripped, for instance.

Given that immutable( const(char) ) == immutable(char), I think the Unqual! should simply be removed.

I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from what I've seen, it can get kind of funny when you try and have an immutable const or a const immutable. Using Unqual makes it very clear what you mean. - Jonathan M Davis

OK. But I think you should file the compiler bug :) From what I see, it's that 'immutable' always win. -------------------------- alias const(char) CC; alias immutable(char) IC; alias immutable(CC) ICC; alias const(IC) CIC; pragma(msg, ICC); pragma(msg, CIC); -------------------------- immutable(char) immutable(char) --------------------------
Jul 15 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Jul 16, 11 04:16, Lars T. Kyllingstad wrote:
 On Sat, 16 Jul 2011 00:17:03 +0800, KennyTM~ wrote:

 On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
    So here you have had to use Unqual
    immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[]
    ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[]
    path, in C2[] ext)

    Instead of Unqual isn't it nicer to use a Deconst!() template?


Hmm, I guess you're right. "shared" shouldn't be stripped, for instance.

Given that immutable( const(char) ) == immutable(char), I think the Unqual! should simply be removed.

I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from what I've seen, it can get kind of funny when you try and have an immutable const or a const immutable. Using Unqual makes it very clear what you mean. - Jonathan M Davis

OK. But I think you should file the compiler bug :) From what I see, it's that 'immutable' always win. -------------------------- alias const(char) CC; alias immutable(char) IC; alias immutable(CC) ICC; alias const(IC) CIC; pragma(msg, ICC); pragma(msg, CIC); -------------------------- immutable(char) immutable(char) --------------------------

True, but this doesn't apply to the present case. Since the parameters are marked with 'in', they become const(immutable(char)[]), not const (immutable(char))[]. This isn't too hard to fix, but I prefer to use Unqual, Deconst, Mutable, or whatever it ends up being called. -Lars

Even if `typeof(path)` becomes `const(string)`, C1 is still an `immutable(char)`, so `immutable(C1)[]` will still work. ------------------------------------ immutable(C1)[] setExtension(C1, C2)(in C1[] path, in C2[] ext) { pragma(msg, typeof(return), " <- ", C1); return typeof(return).init; } void main() { setExtension("1", "2"); setExtension("3".dup, "4".dup); setExtension(cast(const)"5".dup, cast(const)"6".dup); } ------------------------------------ string <- immutable(char) string <- char string <- const(char) ------------------------------------
Jul 15 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
  So here you have had to use Unqual
  immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in
  C2[] ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in
  C1[] path, in C2[] ext)
  
  Instead of Unqual isn't it nicer to use a Deconst!() template?


Hmm, I guess you're right. "shared" shouldn't be stripped, for instance.

Given that immutable( const(char) ) == immutable(char), I think the Unqual! should simply be removed.

I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from what I've seen, it can get kind of funny when you try and have an immutable const or a const immutable. Using Unqual makes it very clear what you mean. - Jonathan M Davis
Jul 15 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 16 Jul 2011 00:17:03 +0800, KennyTM~ wrote:

 On Jul 16, 11 00:05, Jonathan M Davis wrote:
 On Friday 15 July 2011 23:48:39 KennyTM~ wrote:
 On Jul 15, 11 23:26, Lars T. Kyllingstad wrote:
   So here you have had to use Unqual
   immutable(Unqual!C1)[] setExtension(C1, C2)(in C1[] path, in C2[]
   ext) immutable(Unqual!C1)[] defaultExtension(C1, C2)(in C1[]
   path, in C2[] ext)

   Instead of Unqual isn't it nicer to use a Deconst!() template?


Hmm, I guess you're right. "shared" shouldn't be stripped, for instance.

Given that immutable( const(char) ) == immutable(char), I think the Unqual! should simply be removed.

I'd still put the Unqual in there. Perhaps it's due to compiler bugs, but from what I've seen, it can get kind of funny when you try and have an immutable const or a const immutable. Using Unqual makes it very clear what you mean. - Jonathan M Davis

OK. But I think you should file the compiler bug :) From what I see, it's that 'immutable' always win. -------------------------- alias const(char) CC; alias immutable(char) IC; alias immutable(CC) ICC; alias const(IC) CIC; pragma(msg, ICC); pragma(msg, CIC); -------------------------- immutable(char) immutable(char) --------------------------

True, but this doesn't apply to the present case. Since the parameters are marked with 'in', they become const(immutable(char)[]), not const (immutable(char))[]. This isn't too hard to fix, but I prefer to use Unqual, Deconst, Mutable, or whatever it ends up being called. -Lars
Jul 15 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
dsimcha:

 Author's comments -- please read before reviewing:

 Thirdly, I have applied  safe, pure and nothrow wherever possible.
 There were cases where this was not possible, either due to bugs in DMD
 (specifically, 5304, 5700 and 5798) or due to functions in other Phobos
 modules not being properly marked with these attributes.  I have marked
 these with //TODO comments and will fix them as soon as possible.

Adding pure and nothrow to functions in other Phobos modules is a good thing. There are functions like to!int("12") that to become pure require some other functions to become pure first. I think that currently functions that contain enforce can't be pure. Bye, bearophile
Jul 14 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Jul 15, 11 10:22, bearophile wrote:
 dsimcha:

 Author's comments -- please read before reviewing:

 Thirdly, I have applied  safe, pure and nothrow wherever possible.
 There were cases where this was not possible, either due to bugs in DMD
 (specifically, 5304, 5700 and 5798) or due to functions in other Phobos
 modules not being properly marked with these attributes.  I have marked
 these with //TODO comments and will fix them as soon as possible.

Adding pure and nothrow to functions in other Phobos modules is a good thing. There are functions like to!int("12") that to become pure require some other functions to become pure first. I think that currently functions that contain enforce can't be pure. Bye, bearophile

The 'enforce' problem is simple. It could be easier replaced by the 'if(...) throw' construct when not composing, and Don had made a patch for bug 5750 to make it pure also. The big problem is GC.malloc (a.k.a. std.array.uninitializedArray) and other GC functions (a.k.a. std.array.appender). They cannot be easily replaced without sacrificing performance. (I believe all these allocator functions should be made weakly-pure (bug 6151), but so far there isn't official decision). to!int("12") is not pure because of: - std.array.front and popFront, because of: - std.utf.decode (Phobos pull #80 was rejected), because of: - std.exception.enforce (bug 5750 / DMD pull #227) - std.conv.to!(string, size_t), because of: - GC.malloc, or std.array.uninitializedArray if Phobos pull #144 was accepted - std.conv.to!(string, const(ubyte)[]), because of: - std.conv.to!(string, ubyte), because of std.conv.to!(string, uint) - std.array.appender, because of: - GC.extend - GC.qalloc - memcpy (just need a 'pure' annotation) * All of the above are only because of the error-reporting facility. The main content of std.utf.decode is pure and safe. In particular, I don't think it's proper to throw the whole string's encoding back to the user. - ConvOverflowException.raise, just lacking a 'pure' annotation - convError, because of std.conv.to!(string, uint). - bug 6265 (DMD pull #222), which prevents 'to' and 'toImpl' to be pure even if the function body is all pure.
Jul 15 2011
prev sibling next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
IANAL, but there should be mention of the Boost license somewhere.

ssize_t (Line 50):     Why not use sizediff_t?
Line 58..67: I think version blocks should be like

     version (Windows)
     {
       ...
     }
     else version (Posix)
     {
       ...
     }
     else
        static assert(0, "unsupported platform");


pathSplitter: Is it possible to make it a bidirectional range (implement 
.back and .popBack())?

pathCharMatch: On OS X the file system is case-insensitive by default. 
Also, I'm not sure if restricting to ASCII is fine. Actually the 
case-sensitivity is independent of the platform, e.g. you can configure 
a case-sensitive disk on Windows.
Jul 15 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"KennyTM~" <kennytm gmail.com> wrote in message 
news:ivosru$2v11$1 digitalmars.com...
 pathCharMatch: On OS X the file system is case-insensitive by default. 
 Also, I'm not sure if restricting to ASCII is fine. Actually the 
 case-sensitivity is independent of the platform, e.g. you can configure a 
 case-sensitive disk on Windows.

Fun! /me shoots self
Jul 15 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-07-15 10:08, KennyTM~ wrote:
 IANAL, but there should be mention of the Boost license somewhere.

 ssize_t (Line 50): Why not use sizediff_t?
 Line 58..67: I think version blocks should be like

 version (Windows)
 {
 ...
 }
 else version (Posix)
 {
 ...
 }
 else
 static assert(0, "unsupported platform");


 pathSplitter: Is it possible to make it a bidirectional range (implement
 .back and .popBack())?

 pathCharMatch: On OS X the file system is case-insensitive by default.
 Also, I'm not sure if restricting to ASCII is fine. Actually the
 case-sensitivity is independent of the platform, e.g. you can configure
 a case-sensitive disk on Windows.

It's possible to do that on Mac OS X as well. -- /Jacob Carlborg
Jul 15 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 16:08:37 +0800, KennyTM~ wrote:

 IANAL, but there should be mention of the Boost license somewhere.

Oops, seems I forgot.
 ssize_t (Line 50):     Why not use sizediff_t?

Mainly because I wasn't aware of its existence. ;) I'll fix. (I still think ssize_t is a better name, though, and now that you mention it, I recall Andrei and Walter discussing this on the Phobos mailing list a few months ago.)
 Line 58..67: I think
 version blocks should be like
 
      version (Windows)
      {
        ...
      }
      else version (Posix)
      {
        ...
      }
      else
         static assert(0, "unsupported platform");

Good point.
 pathSplitter: Is it possible to make it a bidirectional range (implement
 .back and .popBack())?

It is probably possible, but I don't think the increased complexity will be worth it. Feel free to try to change my mind, though. ;)
 pathCharMatch: On OS X the file system is case-insensitive by default.
 Also, I'm not sure if restricting to ASCII is fine. Actually the
 case-sensitivity is independent of the platform, e.g. you can configure
 a case-sensitive disk on Windows.

Hmm.. how about I add an optional caseSensitive parameter (to both pathCharMatch and glob) that defaults to false on Windows and true on POSIX? -Lars
Jul 15 2011
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"dsimcha" <dsimcha yahoo.com> wrote in message 
news:ivo1ef$1ero$1 digitalmars.com...
 Lars Kyllingstad's new and improved std.path module for Phobos is the next 
 item up in the review queue.

Yay! First of all, very well-written docs :) My notes: - For clarity and completeness, the docs for joinPath should include the example: joinPath("foo", "bar"); It does already have joinPath(r"c:\foo", "bar"), but it's easy to overlook the fact that's demonstrating the same thing, especially if you're more interested in Posix. - For generic purposes, maybe joinPath should accept being called with just one argument? If it already does, then the "Joins two or more path components" should change to "Joins one or more path components" - Maybe there should be a function canonical() that resolves symbolic links, runs absolutePath() and normalize(), and on Windows converts to lowercase. Doesn't need to be in there right now though, could wait for a later release. - Docs don't say how pathCharMatch handles slash and backslash. - I really hate to bring up a function-naming issue at this point, but "glob" is commonly known to mean "Return all existing files/paths that match this pattern" (And that's something we should have at some point, btw, if we don't already). So maybe this glob should be called something like globMatch or matchGlob instead. - I don't think I like how glob works. It doesn't seem to know anything about directory separators, which seems unintuitive and problematic. I'd really like to see it work like this (basically from Ruby): assert( glob("dir/foo", "dir/*") ); assert( glob("dir/foo", "dir/**") ); assert( !glob("dir/foo/bar", "dir/*") ); assert( glob("dir/foo/bar", "dir/**") ); assert( glob("dir/foo/abc", "dir/*/abc") ); assert( glob("dir/foo/abc", "dir/**/abc") ); assert( !glob("dir/foo/bar/abc", "dir/*/abc") ); assert( glob("dir/foo/bar/abc", "dir/**/abc") ); assert( glob("dir/foo", "dir/f*o") ); assert( !glob("dir/faa/boo", "dir/f*o") ); As it is right now, how would you do a non-recursive star-match? Doesn't look like you even can.
Jul 15 2011
next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 "dsimcha" <dsimcha yahoo.com> wrote in message
 news:ivo1ef$1ero$1 digitalmars.com...
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

First of all, very well-written docs :)

Thanks! :)
 My notes:
 
 - For clarity and completeness, the docs for joinPath should include the
 example:
 
 joinPath("foo", "bar");
 
 It does already have joinPath(r"c:\foo", "bar"), but it's easy to
 overlook the fact that's demonstrating the same thing, especially if
 you're more interested in Posix.

Good point.
 - For generic purposes, maybe joinPath should accept being called with
 just one argument? If it already does, then the "Joins two or more path
 components" should change to "Joins one or more path components"

I'll look into it.
 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait for
 a later release.

I'd be more in favour of adding a resolveSymLinks function to std.file. In any case, I'd prefer it not be part of this proposal, and rather be added later.
 - Docs don't say how pathCharMatch handles slash and backslash.

True, I'll fix.
 - I really hate to bring up a function-naming issue at this point, but
 "glob" is commonly known to mean "Return all existing files/paths that
 match this pattern" (And that's something we should have at some point,
 btw, if we don't already). So maybe this glob should be called something
 like globMatch or matchGlob instead.
 
 - I don't think I like how glob works. It doesn't seem to know anything
 about directory separators, which seems unintuitive and problematic. I'd
 really like to see it work like this (basically from Ruby):
 
 assert( glob("dir/foo", "dir/*") );
 assert( glob("dir/foo", "dir/**") );
 
 assert( !glob("dir/foo/bar", "dir/*") ); assert( glob("dir/foo/bar",
 "dir/**") );
 
 assert( glob("dir/foo/abc", "dir/*/abc") ); assert( glob("dir/foo/abc",
 "dir/**/abc") );
 
 assert( !glob("dir/foo/bar/abc", "dir/*/abc") ); assert(
 glob("dir/foo/bar/abc", "dir/**/abc") );
 
 assert( glob("dir/foo", "dir/f*o") ); assert( !glob("dir/faa/boo",
 "dir/f*o") );
 
 As it is right now, how would you do a non-recursive star-match? Doesn't
 look like you even can.

I haven't done anything to glob (i.e. fnmatch in the current std.path) except change its name and improve its documentation. For all intents and purposes, it is not part of my submission. If I find the time, however, I'll look into improving it. -Lars
Jul 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpnr5$gf5$7 digitalmars.com...
 On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait for
 a later release.

I'd be more in favour of adding a resolveSymLinks function to std.file. In any case, I'd prefer it not be part of this proposal, and rather be added later.

Oh, yea, I didn't mean to imply that there shouldn't be a resolveSymLinks function. But I strongly believe that a function which combines all of that (and maybe the 8.3 -> full-name expansion on Windows, and maybe expansions of things like %APPDATA%) and guarantees a single unique canonical name for each actual (existing or non-existing) file would be a very, very, very good thing to have. Essenital, in fact, for comparing filepaths. In any case, I agree that this can all wait until a later time and doesn't need to be in this proposal.
 I haven't done anything to glob (i.e. fnmatch in the current std.path)
 except change its name and improve its documentation.  For all intents
 and purposes, it is not part of my submission.  If I find the time,
 however, I'll look into improving it.

I see. In that case, maybe its name should remain fnmatch for now (or fnglob if you really want "glob" in there). That would prevent confuson both now and whenever it gets improved.
Jul 15 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 15:34:37 -0400, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpnr5$gf5$7 digitalmars.com...
 On Fri, 15 Jul 2011 04:33:34 -0400, Nick Sabalausky wrote:

 - Maybe there should be a function canonical() that resolves symbolic
 links, runs absolutePath() and normalize(), and on Windows converts to
 lowercase. Doesn't need to be in there right now though, could wait
 for a later release.

I'd be more in favour of adding a resolveSymLinks function to std.file. In any case, I'd prefer it not be part of this proposal, and rather be added later.

resolveSymLinks function. But I strongly believe that a function which combines all of that (and maybe the 8.3 -> full-name expansion on Windows, and maybe expansions of things like %APPDATA%) and guarantees a single unique canonical name for each actual (existing or non-existing) file would be a very, very, very good thing to have. Essenital, in fact, for comparing filepaths. In any case, I agree that this can all wait until a later time and doesn't need to be in this proposal.
 I haven't done anything to glob (i.e. fnmatch in the current std.path)
 except change its name and improve its documentation.  For all intents
 and purposes, it is not part of my submission.  If I find the time,
 however, I'll look into improving it.

fnglob if you really want "glob" in there). That would prevent confuson both now and whenever it gets improved.

Fair enough. I'll think of a better name and fix it. -Lars
Jul 15 2011
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
Since the details of many of these functions have changed, is it really a 
good idea to make the depricated names aliases of the new functions? If 
anyone has code that relies on the screwy old behavior (such as trying to 
compensate for their problems) then that code will silently break.
Jul 15 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 15 July 2011 04:38:35 Nick Sabalausky wrote:
 Since the details of many of these functions have changed, is it really a
 good idea to make the depricated names aliases of the new functions? If
 anyone has code that relies on the screwy old behavior (such as trying to
 compensate for their problems) then that code will silently break.

If the behavior has changed, then the old functions need to be there. Otherwise, code will silently break, which would be worse than just outright removing the functions. - Jonathan M Davis
Jul 15 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 01:55:18 -0700, Jonathan M Davis wrote:

 On Friday 15 July 2011 04:38:35 Nick Sabalausky wrote:
 Since the details of many of these functions have changed, is it really
 a good idea to make the depricated names aliases of the new functions?
 If anyone has code that relies on the screwy old behavior (such as
 trying to compensate for their problems) then that code will silently
 break.

If the behavior has changed, then the old functions need to be there. Otherwise, code will silently break, which would be worse than just outright removing the functions.

I'll put them back in. -Lars
Jul 15 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the  
 next item up in the review queue.

Three notes: 1. No tests for extension("foo") is null && extension("foo.") !is null 2. absolutePath's signature could be changed to string absolutePath(string path, string base = getcwd()) for more flexibility at no penalty 3. Would be nice to have relativePath(path, base) to complement absolutePath(). -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message 
news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the 
 next item up in the review queue.

Three notes: 1. No tests for extension("foo") is null && extension("foo.") !is null

I don't know if you're talking about actual unittests or documentation, but that brings up a good point: The behavior of extension("foo.") should be clearly documented. (Unless it already is and I just overlooked it?)
 2. absolutePath's signature could be changed to string absolutePath(string 
 path, string base = getcwd()) for more flexibility at no penalty
 3. Would be nice to have relativePath(path, base) to complement 
 absolutePath().

These two are absolutely (no pun intended) fantastic ideas. On #2, you're totally right about "more flexibility at no penalty". And relativePath() is necessary for converting a path into a relocatable form, which definitely has its uses especially when dealing with paths that will end up being used on other computers.
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 7/15/11, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 2. absolutePath's signature could be changed to string absolutePath(string
 path, string base = getcwd()) for more flexibility at no penalty

+1.
Jul 15 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 02:35:02 +0200, Andrej Mitrovic wrote:

 Lars, is normalize basically that feature request (toNativePath) I asked
 for? It does seem to do more than that though.

Oops, sorry. Because your feature request came in on a different channel than the others (my GitHub inbox), I completely forgot about it. normalize() is different -- it changes '/' to '\' on Windows, but it doesn't do anything with '\' on POSIX, since the backslash is a perfectly valid filename character on that platform. Basically, Andrej has requested I add a toNativePath function which replaces all occurrences of '/' in a string with '\' on Windows, and vice versa on POSIX. Thinking some more about it, I personally don't think I'll find this very useful, but I am interested in hearing others' opinions. Does anyone else think toNativePath should be included? -Lars
Jul 15 2011
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

For consistency with std.algorithm.cmp and std.string.icmp, on which it is based. I'd be open for other suggestions if people think it should be changed.
 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

Nope, as specified in the docs it is a no-op on Windows. As Nick points out, it is not obvious what is the correct "home directory" on Windows. Moreover, and perhaps even more importantly, there is no precedence for the tilde character to mean "home directory" on Windows at all. -Lars
Jul 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ivpl34$gf5$3 digitalmars.com...
 On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

For consistency with std.algorithm.cmp and std.string.icmp, on which it is based. I'd be open for other suggestions if people think it should be changed.

I'm happy with it having the obscure name "fcmp" because *all* it is is either a case-insensitive or case-sensitive string comparison, and I think it's dangerous to blindly use that to compare filepaths. If it were named something like "filePathCompare", then people would get a nastly surprise when this returned false instead of true: filePathCompare(r"C:\dir\file.txt", r"C:\dir/foo\..\file.txt"); I can't imagine getting false from that would ever be desirable. This is a big part of why I think we need a canonical() function (although maybe a more like "canonicalPath" for the same reason that we have "joinPath" instead of "join").
 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

Nope, as specified in the docs it is a no-op on Windows. As Nick points out, it is not obvious what is the correct "home directory" on Windows. Moreover, and perhaps even more importantly, there is no precedence for the tilde character to mean "home directory" on Windows at all.

Actually, if Windows did have one single equivalent to "/home/{user}", then I think it would be absolutely fantastic to make tilde mean "home directory" on Windows. But as things are, that's a moot point, of course.
Jul 15 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 14:12:55 +0200, Mafi wrote:

 What about longname becoming longna~1 on dos/emulators etc. Expanding
 this back needs information about the actual state of the file system so
 I don't now if such a function belongs to std.path but a longname.png =>
 longna~1.png belongs IMO to std.path .

That too, requires information about the state of the filesystem. For instance, if the file "longnaps" already exists, and is abbreviated to "longna~1", the correct abbreviation of "longname" would be "longna~2". -Lars
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
The thing although Windows works with forward slashes some tools might
not work with forward slashes, so toNativePath could be useful there.

Then again for example GIT doesn't work with backward slashes (at
least .gitignore doesn't want to work with it). So maybe it should
have an optional parameter that chooses between Posix or Windows
paths..

Or we can just use string.replace("/", r"\"); But having it in
std.path would be more useful.
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
The thing is*
Jul 15 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 17:15:25 +0300, Vladimir Panteleev wrote:

 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:
 
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

Three notes: 1. No tests for extension("foo") is null && extension("foo.") !is null 2. absolutePath's signature could be changed to string absolutePath(string path, string base = getcwd()) for more flexibility at no penalty 3. Would be nice to have relativePath(path, base) to complement absolutePath().

1. I'll fix. 2. Good idea! 3. There are infinitely many possibilities for a relative path, and I don't know how useful this function will be. Please convince me otherwise. :) -lars
Jul 15 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Fri, 15 Jul 2011 18:55:58 +0300, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 3. There are infinitely many possibilities for a relative path

The shortest version is most useful. The idea is to do this: 1) find a common root (if none exists, return input absolute path as is) 2) add as many ../ as necessary to reach common root from base path 3) append second part of input path
 I don't know how useful this function will be.  Please convince me
 otherwise.

Some software and tools will commonly create and output absolute paths. There are non-relocatable; for example, if you want to copy a build script generated by an IDE, you need to update all absolute paths, or convert them to relative ones. My argument is that I recall looking for this function in the old std.path several times, and was disappointed not to find it. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 15 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ivpl34$gf5$3 digitalmars.com...
 On Fri, 15 Jul 2011 02:45:02 +0200, Andrej Mitrovic wrote:

 All functions have nice names except fcmp, why the abbreviation?

For consistency with std.algorithm.cmp and std.string.icmp, on which it is based. I'd be open for other suggestions if people think it should be changed.

I'm happy with it having the obscure name "fcmp" because *all* it is is either a case-insensitive or case-sensitive string comparison, and I think it's dangerous to blindly use that to compare filepaths. If it were named something like "filePathCompare", then people would get a nastly surprise when this returned false instead of true: filePathCompare(r"C:\dir\file.txt", r"C:\dir/foo\..\file.txt"); I can't imagine getting false from that would ever be desirable. This is a big part of why I think we need a canonical() function (although maybe a more like "canonicalPath" for the same reason that we have "joinPath" instead of "join").
 Is expandTilde supposed to work on Windows? I've never seen tilde used
 there.

Nope, as specified in the docs it is a no-op on Windows. As Nick points out, it is not obvious what is the correct "home directory" on Windows. Moreover, and perhaps even more importantly, there is no precedence for the tilde character to mean "home directory" on Windows at all.

Actually, if Windows did have one single equivalent to "/home/{user}", then I think it would be absolutely fantastic to make tilde mean "home directory" on Windows. But as things are, that's a moot point, of course.

The typical thing is to use the environment variable USERPROFILE, but that's an odd dependency for a string processing library. Windows also doesn't keep the same notion of home directory as Unix does. You can read more about it here: http://en.wikipedia.org/wiki/Environment_variable If that's the only function that uses environment variables, and you can't have the function without it, it might be worth having it. However, we must think about how that affects things like purity. -Steve
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 7/15/11, Nick Sabalausky <a a.a> wrote:
 Actually, if Windows did have one single equivalent to "/home/{user}", then
 I think it would be absolutely fantastic to make tilde mean "home directory"
 on Windows. But as things are, that's a moot point, of course.

%UserProfile% http://en.wikipedia.org/wiki/Home_directory#Default_Home_Directory_per_Operating_System
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Also, you can change the location of your home directory, although
it's not recommended since some applications like to hardwire
directories.UserProfile should reflect the changes though.
Jul 15 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 19:47:04 +0300, Vladimir Panteleev wrote:

 On Fri, 15 Jul 2011 18:55:58 +0300, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 3. There are infinitely many possibilities for a relative path

The shortest version is most useful. The idea is to do this: 1) find a common root (if none exists, return input absolute path as is) 2) add as many ../ as necessary to reach common root from base path 3) append second part of input path

Thinking some more about it, this algorithm should be easy to implement using pathSplitter. I'll get onto it. :) -Lars
Jul 15 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 15:42:50 -0400, Nick Sabalausky wrote:

 "Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message
 news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

Three notes: 1. No tests for extension("foo") is null && extension("foo.") !is null

I don't know if you're talking about actual unittests or documentation, but that brings up a good point: The behavior of extension("foo.") should be clearly documented. (Unless it already is and I just overlooked it?)

So what you guys are saying is, basically, that it matters whether a function returns null or ""? I'd be inclined to say it doesn't, and leave it undefined, but I'd be interested in hearing good arguments against this.
 2. absolutePath's signature could be changed to string
 absolutePath(string path, string base = getcwd()) for more flexibility
 at no penalty 3. Would be nice to have relativePath(path, base) to
 complement absolutePath().

you're totally right about "more flexibility at no penalty". And relativePath() is necessary for converting a path into a relocatable form, which definitely has its uses especially when dealing with paths that will end up being used on other computers.

There is a penalty, albeit a small one, because it needs to be verified that isAbsolute(base). -Lars
Jul 15 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 16 Jul 2011 00:26:15 +0300, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Fri, 15 Jul 2011 15:42:50 -0400, Nick Sabalausky wrote:

 "Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message
 news:op.vyny7zfetuzx1w cybershadow.mshome.net...
 On Fri, 15 Jul 2011 03:20:13 +0300, dsimcha <dsimcha yahoo.com> wrote:

 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.

Three notes: 1. No tests for extension("foo") is null && extension("foo.") !is null

I don't know if you're talking about actual unittests or documentation, but that brings up a good point: The behavior of extension("foo.") should be clearly documented. (Unless it already is and I just overlooked it?)

So what you guys are saying is, basically, that it matters whether a function returns null or ""? I'd be inclined to say it doesn't, and leave it undefined, but I'd be interested in hearing good arguments against this.

My main argument for this is that this is the *documented* behavior of getExt in the current version of std.path. But yes, the difference between "" and null is the same as the difference between "empty" and "absent". -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 15 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
Names
-----------
dirSeparator:
I'd love to see this as something much shorter like dirSep. I think that 
dirSeparator is overly long - particularly for something which is likely to be 
used primarily in longer expressions.

currentDirSymbol and parentDirSymbol:
The same here. These are _way_ longer than the strings that they represent. If 
we have them, they should be much shorter (e.g. currDirSym and parDirSym). But 
really, I think that I have to concur with dsimcha on this one. The strings 
for these are completely standard, easy to remember, and even shorter names 
for them would still be longer than the actual strings by a fair margin. We 
should just toss them.

isDirSeparator:
I'd love to see this shortened to isDirSep.

extension:
I'd love to see this shortened to ext.

stripExtension:
I'd love to see this shortened to stripExt.

setExtensions:
I think that it should have replace in its name rather than set, since it's 
not altering the original string. It's replacing what's there, not setting it. 
And again, I think that Extension should be  shortened to Ext, so the renamed 
function would be replaceExt.

defaultExtension:
This function is not a property, so it probably shouldn't be a noun, though 
I'm not quite sure what to call it. Technically, it's something like 
setWithDefaultExtensionIfNoExtension, but that's obviously a horrible name 
even if you abbreviate some of it. So, I don't really have a better name, 
though if it's not gonig to be completely renamed, I definitely think that it 
should be shortened to defaultExt rather than having the full Extension in its 
name.

joinPath:
I'm not sure that this function should be renamed to joinPath. When discussing 
other modules such as as std.ascii and std.uni, we decided that they should 
have the same function names and _not_ put the module name in their names. 
Now, this function _is_ joining paths, so it's not purely a case of having 
the module's name in the function name, but I'm still not sure that we want to 
have it as joinPath instead of join.

absolutePath:
It really shouldn't be a noun, since it's not a property. Something like 
makeAbsolute would be better.

Implementation
------------------------
You should probably consider testing the various functions with varying string 
types - both in size and mutability. Just put the test in a foreach loop like

foreach (S; TypeTuple!(string, wstring, dstring))

(but with whatever combination of string types you want to test; In the ideal 
case it would be every possible combination of character type and mutability, 
but that's not necessarily necessary). Then you'll probably end up using 
to!S() on pretty much every string. And for functions which take multiple 
strings, you can test with a combination of string types. It'll make sure that 
the functions work appropriately with the various types. The functions are 
templated, but sometimes something gets missed and they don't work with a 
particular character size or level of mutability. So, while it might be 
overkill to test _every_ combination, verifying at least some of them would be 
desirable.

Personally, I think that it's better practice to test for empty rather than 
length == 0, but I don't suppose that it really matters given that we're 
always dealing with arrays here.

Your note on baseName about it adhering to the requirements for Posix's 
basename utility should probably put in the actual documentation. The same for 
dirName.

There's no point in checking for "== null" like you do in a few places. If you 
want to check whether it's actually null, you need to check for "is null", and 
if you want to check whether it's empty, then you should just use empty. "== 
null" has a tendancy to be misleading, because people tend to think that it 
actually checks whether an array is null, which it doesn't do. Personally, I'm 
likely to think that it's a bug when I see it.

It would probably be more efficient for setExtension to use ~=. e.g.

auto retval = stripExtension(path);
retval ~= '.';
retval ~= ext;
return cast(typeof(return))(retval);

Maybe it doesn't matter (and the code _is_  a bit less clean that way), but it 
might reduce the number of necessary heap allocations. The same goes for 
pretty much anywhere else that ~ is used. It probably doesn't matter all that 
much, but if we want to get the functions as efficient as possible, then it's 
better to use ~= than ~ unless the compiler is actually smart enough to 
tranlate the ~'s into ~='s, but I doubt that it is (particularly since some 
chunk of that is in druntime rather than in the compiler itself).

There's no such thing as an OutOfMemoryException. It's an OutOfMemoryError. 
And being an error, I don't think that there's any need to mention it in the 
documentation. _Anything_ which allocates memory could cause the application 
to run out of memory, and being an Error, you're really not supposed to catch 
it, so there's no real point in mentioning it in the documentation.


Okay. I think that that covers it, though I might have missed something. Aside 
from a few nitpicks, my main issue with the module as it stands is the 
unnecessarily long function names. The function names should be as long as is 
necessary to make them clear but not longer. And putting Extension and 
Separator in names is completely unnecessary. Ext and Sep are quite clear 
enough (especially in context) and are _much_ shorter. The longer names add 
nothing and make the functions/variables much more annoying to deal with in 
longer expressions.

So overall, I think that it looks good and is a definite improvement, but I 
really think that the function names need to be shortened.

- Jonathan M Davis
Jul 15 2011
parent "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.1680.1310785695.14074.digitalmars-d puremagic.com...
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When 
 discussing
 other modules such as as std.ascii and std.uni, we decided that they 
 should
 have the same function names and _not_ put the module name in their names.
 Now, this function _is_ joining paths, so it's not purely a case of having
 the module's name in the function name, but I'm still not sure that we 
 want to
 have it as joinPath instead of join.

I really, really, REALLY, *REALLY* think it's a big mistake to categorically ban using the module name in the function names. And its *especially* bad in this case. Having both std.algorithm.join and std.path.join with their different semantics is just begging for problems. "join" is a terrible, terrible, awful, horrible name for this function. I don't care if using conflicting names between std.ascii and std.uni was deemed OK. Every case needs to be judged on its own or else we're knee-deep in bondage & discipline design and we may as well call ourselves "Sun Microsystems" and our language "Java". And yes, the precedent set by std.ascii and std.uni is a factor in the decision, but I think the rest of the situation here easily overrides that concern. Hell, when people see "join" they think of the semantics of std.algortihm.join, *never* "Oh, 'join' means using path separators." And looking at the names of the arguments, or other context clues, isn't a valid counter-argument because when I see "join(path1, path2, filename)" I'm going to immediately think "bug, or at least overly fragile". And why not? From the name "join", I have no way to tell. I don't give a crap about "Ext" vs "Extension" (they both get the job done), and I think that horse has already been beaten to death last time std.path was discussed. But calling joinPath "join", is bad, bad, bad, bad, BAD.
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I'd rather have `extension` renamed to getExt. `ext` is used way too
often in user-code.
Jul 15 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 16 July 2011 05:34:06 Andrej Mitrovic wrote:
 I'd rather have `extension` renamed to getExt. `ext` is used way too
 often in user-code.

extension is a bit iffy, since it's just the one word. So, I'm not as concerned about that one (and I'd rather have it be a property than a have it be the normal function getExt). But all of the others where Extension is just part of a longer name just don't need to be that long. The combination of the name and its usage should make it as clear with just Ext as it is with Extension, and the shorter names are much more pleasant to deal with - especially since (in my experience at least) the various std.path functions get used together in larger expressions such that the longer names are definitely an issue. So, while I'm not altogether fond of extension, it's fine with me if it stays as extension. But in the cases where Extension is just part of a longer name, I'd _really_ like to see them shortened to Ext. - Jonathan M Davis
Jul 15 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 16 July 2011 01:10:02 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.1680.1310785695.14074.digitalmars-d puremagic.com...
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing
 other modules such as as std.ascii and std.uni, we decided that they
 should
 have the same function names and _not_ put the module name in their
 names. Now, this function _is_ joining paths, so it's not purely a case
 of having the module's name in the function name, but I'm still not
 sure that we want to
 have it as joinPath instead of join.

I really, really, REALLY, *REALLY* think it's a big mistake to categorically ban using the module name in the function names. And its *especially* bad in this case. Having both std.algorithm.join and std.path.join with their different semantics is just begging for problems. "join" is a terrible, terrible, awful, horrible name for this function. I don't care if using conflicting names between std.ascii and std.uni was deemed OK. Every case needs to be judged on its own or else we're knee-deep in bondage & discipline design and we may as well call ourselves "Sun Microsystems" and our language "Java". And yes, the precedent set by std.ascii and std.uni is a factor in the decision, but I think the rest of the situation here easily overrides that concern. Hell, when people see "join" they think of the semantics of std.algortihm.join, *never* "Oh, 'join' means using path separators." And looking at the names of the arguments, or other context clues, isn't a valid counter-argument because when I see "join(path1, path2, filename)" I'm going to immediately think "bug, or at least overly fragile". And why not? From the name "join", I have no way to tell. I don't give a crap about "Ext" vs "Extension" (they both get the job done), and I think that horse has already been beaten to death last time std.path was discussed. But calling joinPath "join", is bad, bad, bad, bad, BAD.

Well, like I said, I'm not sure that naming it joinPath instead of join is a good idea, but I'm not sure that it's a bad idea either. In this particular case, path is not only the module name, but it's actually what's being joined. So, I'm not entirely comfortable naming it joinPath, but I'm not entirely comfortable naming it join either. I think that the general decision to not put module names in function names needs to be brought up in this case, since that is essentially what is happening here, but it may very well be that this is a case where we should not follow that general rule. - Jonathan M Davis
Jul 15 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I was quite fond of it being renamed to joinPath, because when I'm
doing string processing on paths I import both std.algorithm/std.array
and std.path, and so far I've always had conflicts with join()
functions.
Jul 16 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 15 Jul 2011 20:08:03 -0700, Jonathan M Davis wrote:

 Names
 -----------
 dirSeparator:
 I'd love to see this as something much shorter like dirSep. I think that
 dirSeparator is overly long - particularly for something which is likely
 to be used primarily in longer expressions.
 
 currentDirSymbol and parentDirSymbol: The same here. These are _way_
 longer than the strings that they represent. If we have them, they
 should be much shorter (e.g. currDirSym and parDirSym). But really, I
 think that I have to concur with dsimcha on this one. The strings for
 these are completely standard, easy to remember, and even shorter names
 for them would still be longer than the actual strings by a fair margin.
 We should just toss them.
 
 isDirSeparator:
 I'd love to see this shortened to isDirSep.
 
 extension:
 I'd love to see this shortened to ext.
 
 stripExtension:
 I'd love to see this shortened to stripExt.

This has been discussed before, and even voted upon. While the voting was very informal, with hardly enough participants to claim statistical significance, it at least showed that the current names have (perhaps even majority) support in the community: http://www.digitalmars.com/d/archives/digitalmars/D/ Proposal_for_std.path_replacement_131121.html#N131332
 setExtensions:
 I think that it should have replace in its name rather than set, since
 it's not altering the original string. It's replacing what's there, not
 setting it. And again, I think that Extension should be  shortened to
 Ext, so the renamed function would be replaceExt.

To me, "replace" has an even stronger connotation than "set" that the original string is being modified. Even so, that's actually not the reason I chose "set". Rather, I think that - replaceExtension implies that an existing extension gets replaced, - addExtension (the current std.path uses addExt) implies that an extension gets added -- appended, that is -- regardless of whether one exists already, while - setExtension implies that the extension is set to the given value whether the path already has one or not. Maybe that's just me, but at least I'd like to get more opinions before changing it.
 defaultExtension:
 This function is not a property, so it probably shouldn't be a noun,
 though I'm not quite sure what to call it. Technically, it's something
 like setWithDefaultExtensionIfNoExtension, but that's obviously a
 horrible name even if you abbreviate some of it. So, I don't really have
 a better name, though if it's not gonig to be completely renamed, I
 definitely think that it should be shortened to defaultExt rather than
 having the full Extension in its name.
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing other modules such as as std.ascii and std.uni, we decided
 that they should have the same function names and _not_ put the module
 name in their names. Now, this function _is_ joining paths, so it's not
 purely a case of having the module's name in the function name, but I'm
 still not sure that we want to have it as joinPath instead of join.

Nick expressed my feelings about this naming choice pretty well. My main reason for not going with join() here is that this function does something very different from the other join().
 absolutePath:
 It really shouldn't be a noun, since it's not a property. Something like
 makeAbsolute would be better.

I originally called it toAbsolute(), but someone complained about that, too. Sigh. When it comes to function names, it is very hard to please everyone.
 Implementation
 ------------------------
 You should probably consider testing the various functions with varying
 string types - both in size and mutability. Just put the test in a
 foreach loop like
 
 foreach (S; TypeTuple!(string, wstring, dstring))
 
 (but with whatever combination of string types you want to test; In the
 ideal case it would be every possible combination of character type and
 mutability, but that's not necessarily necessary). Then you'll probably
 end up using to!S() on pretty much every string. And for functions which
 take multiple strings, you can test with a combination of string types.
 It'll make sure that the functions work appropriately with the various
 types. The functions are templated, but sometimes something gets missed
 and they don't work with a particular character size or level of
 mutability. So, while it might be overkill to test _every_ combination,
 verifying at least some of them would be desirable.

Actually, I do this in most unittests, although not as systematically as what you suggest. Look at the baseName unittests, for example: "file.ext"w is a wstring, "file.ext"d is a dstring and "file"w.dup is a wchar[]. The only thing I'm not testing is const(T)[]. Maybe your approach would be better.
 Personally, I think that it's better practice to test for empty rather
 than length == 0, but I don't suppose that it really matters given that
 we're always dealing with arrays here.

I agree.
 Your note on baseName about it adhering to the requirements for Posix's
 basename utility should probably put in the actual documentation. The
 same for dirName.

I considered doing that, but I was afraid of alienating Windows programmers. It's probably a good idea.
 There's no point in checking for "== null" like you do in a few places.
 If you want to check whether it's actually null, you need to check for
 "is null", and if you want to check whether it's empty, then you should
 just use empty. "== null" has a tendancy to be misleading, because
 people tend to think that it actually checks whether an array is null,
 which it doesn't do. Personally, I'm likely to think that it's a bug
 when I see it.

Agreed. I'll fix that.
 It would probably be more efficient for setExtension to use ~=. e.g.
 
 auto retval = stripExtension(path);
 retval ~= '.';
 retval ~= ext;
 return cast(typeof(return))(retval);

 Maybe it doesn't matter (and the code _is_  a bit less clean that way),
 but it might reduce the number of necessary heap allocations. The same
 goes for pretty much anywhere else that ~ is used. It probably doesn't
 matter all that much, but if we want to get the functions as efficient
 as possible, then it's better to use ~= than ~ unless the compiler is
 actually smart enough to tranlate the ~'s into ~='s, but I doubt that it
 is (particularly since some chunk of that is in druntime rather than in
 the compiler itself).

Yeah, David pointed this out as well.
 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real point
 in mentioning it in the documentation.

I'll fix the documentation.
 Okay. I think that that covers it, though I might have missed something.
 Aside from a few nitpicks, my main issue with the module as it stands is
 the unnecessarily long function names. The function names should be as
 long as is necessary to make them clear but not longer. And putting
 Extension and Separator in names is completely unnecessary. Ext and Sep
 are quite clear enough (especially in context) and are _much_ shorter.
 The longer names add nothing and make the functions/variables much more
 annoying to deal with in longer expressions.
 
 So overall, I think that it looks good and is a definite improvement,
 but I really think that the function names need to be shortened.

Thanks for a thorough review! :) -Lars
Jul 16 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 16 July 2011 19:39:13 Lars T. Kyllingstad wrote:
 On Fri, 15 Jul 2011 20:08:03 -0700, Jonathan M Davis wrote:
 Names
 -----------
 dirSeparator:
 I'd love to see this as something much shorter like dirSep. I think that
 dirSeparator is overly long - particularly for something which is likely
 to be used primarily in longer expressions.
 
 currentDirSymbol and parentDirSymbol: The same here. These are _way_
 longer than the strings that they represent. If we have them, they
 should be much shorter (e.g. currDirSym and parDirSym). But really, I
 think that I have to concur with dsimcha on this one. The strings for
 these are completely standard, easy to remember, and even shorter names
 for them would still be longer than the actual strings by a fair margin.
 We should just toss them.
 
 isDirSeparator:
 I'd love to see this shortened to isDirSep.
 
 extension:
 I'd love to see this shortened to ext.
 
 stripExtension:
 I'd love to see this shortened to stripExt.

This has been discussed before, and even voted upon. While the voting was very informal, with hardly enough participants to claim statistical significance, it at least showed that the current names have (perhaps even majority) support in the community: http://www.digitalmars.com/d/archives/digitalmars/D/ Proposal_for_std.path_replacement_131121.html#N131332

Well, looking at that, it looks like it was pretty much a toss up on most of those, with them being one vote off from one another. There just isn't a large enough sample there to say much in most cases. It could easily go either way if a decent number of people voted on it. Personally, I'm definitely against putting Extension and Separator in the names because it makes them much longer without really adding any information.
 setExtensions:
 I think that it should have replace in its name rather than set, since
 it's not altering the original string. It's replacing what's there, not
 setting it. And again, I think that Extension should be  shortened to
 Ext, so the renamed function would be replaceExt.

To me, "replace" has an even stronger connotation than "set" that the original string is being modified. Even so, that's actually not the reason I chose "set". Rather, I think that - replaceExtension implies that an existing extension gets replaced, - addExtension (the current std.path uses addExt) implies that an extension gets added -- appended, that is -- regardless of whether one exists already, while - setExtension implies that the extension is set to the given value whether the path already has one or not. Maybe that's just me, but at least I'd like to get more opinions before changing it.

Well, it's certainly debatable, but I don't like the idea of having set in a function which isn't altering its arguments. It is a valid point about replace impliing that an extension exists, but you could also argue that every file has an extension but that some have an empty extension, so replace is replacing the empty one - though as it stands if you try and do the reverse and give setExtension an empty extension, you end up with "." on the end instead of nothing. So, anyway, it's certainly debatable, but personally, I think that set shouldn't be used on functions which don't alter at least one of their arguments.
 defaultExtension:
 This function is not a property, so it probably shouldn't be a noun,
 though I'm not quite sure what to call it. Technically, it's something
 like setWithDefaultExtensionIfNoExtension, but that's obviously a
 horrible name even if you abbreviate some of it. So, I don't really have
 a better name, though if it's not gonig to be completely renamed, I
 definitely think that it should be shortened to defaultExt rather than
 having the full Extension in its name.
 
 joinPath:
 I'm not sure that this function should be renamed to joinPath. When
 discussing other modules such as as std.ascii and std.uni, we decided
 that they should have the same function names and _not_ put the module
 name in their names. Now, this function _is_ joining paths, so it's not
 purely a case of having the module's name in the function name, but I'm
 still not sure that we want to have it as joinPath instead of join.

Nick expressed my feelings about this naming choice pretty well. My main reason for not going with join() here is that this function does something very different from the other join().

Actually, it really isn't all that much different from std.array.join. It's just that it's always using a specific separator, and unlike std.array.join, it doesn't add the separator if it's already there. So, it wouldn't be at all amiss to call it just join. But I can understand not wanting the two to have the same name. As I said, I'm a bit divided on it. In general, we shouldn't be putting the module's naming in a function's name, but at the same time, joinPath _is_ joining paths rather than path just being the module name. I brought it up because it does fall in the camp of functions with the same name with essentially the same functionality (albeit tweaked for their particular module and use case), and it was decided that in the general case, such functions should have the same name and let the module system be used to distinguish them. But while that's what we should be doing in the general case, that doesn't necessarily mean that we _have_ to do it in every case. The pros and cons must be weighed with a major reason to use the same naming being that that's the general convention. But if there's enough reason to use a different name, then we can use a different name. I don't feel strongly about it in either direction in this particular case. I think that it could go either way. I just felt that it needed to be brought up.
 absolutePath:
 It really shouldn't be a noun, since it's not a property. Something like
 makeAbsolute would be better.

I originally called it toAbsolute(), but someone complained about that, too. Sigh. When it comes to function names, it is very hard to please everyone.

Well, I'm not terribly hung up on what the exact name is, but in general, variables, properties, and types should be nouns whereas functions should be verbs. absolutePath is a normal function, but its name is a noun. And as such, I think that makeAbsolute or toAbsolute makes a lot more sense. If other people think that path should be in the name for clarity or whatnot, then it could be change to toAbsPath or something similar.
 Implementation
 ------------------------
 You should probably consider testing the various functions with varying
 string types - both in size and mutability. Just put the test in a
 foreach loop like
 
 foreach (S; TypeTuple!(string, wstring, dstring))
 
 (but with whatever combination of string types you want to test; In the
 ideal case it would be every possible combination of character type and
 mutability, but that's not necessarily necessary). Then you'll probably
 end up using to!S() on pretty much every string. And for functions which
 take multiple strings, you can test with a combination of string types.
 It'll make sure that the functions work appropriately with the various
 types. The functions are templated, but sometimes something gets missed
 and they don't work with a particular character size or level of
 mutability. So, while it might be overkill to test _every_ combination,
 verifying at least some of them would be desirable.

Actually, I do this in most unittests, although not as systematically as what you suggest. Look at the baseName unittests, for example: "file.ext"w is a wstring, "file.ext"d is a dstring and "file"w.dup is a wchar[]. The only thing I'm not testing is const(T)[]. Maybe your approach would be better.
 Personally, I think that it's better practice to test for empty rather
 than length == 0, but I don't suppose that it really matters given that
 we're always dealing with arrays here.

I agree.
 Your note on baseName about it adhering to the requirements for Posix's
 basename utility should probably put in the actual documentation. The
 same for dirName.

I considered doing that, but I was afraid of alienating Windows programmers. It's probably a good idea.

It makes it clearer what it's doing, and unless Windows has an equivalent function, then it's not like you're choosing the functionality of one over the other. And from the little poking around I just did, it seems that Windows does _not_ have an equivalent function (apparently you have to use something like _splitpath). So, I think that it's just better to put in tho docs. It makes it clearer, and if the link has good info, then it's that much better.
 There's no point in checking for "== null" like you do in a few places.
 If you want to check whether it's actually null, you need to check for
 "is null", and if you want to check whether it's empty, then you should
 just use empty. "== null" has a tendancy to be misleading, because
 people tend to think that it actually checks whether an array is null,
 which it doesn't do. Personally, I'm likely to think that it's a bug
 when I see it.

Agreed. I'll fix that.
 It would probably be more efficient for setExtension to use ~=. e.g.
 
 auto retval = stripExtension(path);
 retval ~= '.';
 retval ~= ext;
 return cast(typeof(return))(retval);
 
 Maybe it doesn't matter (and the code _is_  a bit less clean that way),
 but it might reduce the number of necessary heap allocations. The same
 goes for pretty much anywhere else that ~ is used. It probably doesn't
 matter all that much, but if we want to get the functions as efficient
 as possible, then it's better to use ~= than ~ unless the compiler is
 actually smart enough to tranlate the ~'s into ~='s, but I doubt that it
 is (particularly since some chunk of that is in druntime rather than in
 the compiler itself).

Yeah, David pointed this out as well.
 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real point
 in mentioning it in the documentation.

I'll fix the documentation.

Actually, core.memory needs to be fixed on that count (that's probably where you got the idea about OutOfMemoryException). I should probably create a pull request to fix that. - Jonathan M Davis
Jul 16 2011
prev sibling next sibling parent reply torhu <no spam.invalid> writes:
On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

Looks nice and clean, both docs and code! I like modules that are not overengineered :) I noticed a couple of things that I don't think have been mentioned already: The docs for defaultExtension say that there's one case where doesn't allocate. But the code says that it always does (.idup isn't conditional, it's just .dup with a different return type). joinPath also uses .idup, thus always allocates. Maybe there could be a documented allocation policy for the module as a whole? Copy-on-write used to be the general rule for Phobos, don't know if that's true anymore. For most of the functions in this module, even a single heap allocation could be more expensive than the rest of what they do. I've read the discussions about the function names, but I still dare to make a suggestion of my own. I feel like joinPath is a bit iffy, shouldn't it be plural somehow? What it does is to create a path by putting pieces together, some paths in their own right, some not. My suggestion is to simply call it buildPath.
Jul 17 2011
next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/



overengineered :)

Thanks!
 I noticed a couple of things that I don't think have been mentioned
 already:
 
 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

There is no reason for idup to duplicate an array which is already immutable. If it does, I'd say it is a compiler bug.
 joinPath also uses .idup, thus always allocates.  Maybe there could be a
 documented allocation policy for the module as a whole?  Copy-on-write
 used to be the general rule for Phobos, don't know if that's true
 anymore.  For most of the functions in this module, even a single heap
 allocation could be more expensive than the rest of what they do.

True. I've tried keeping allocations to a minimum. If you see other places where an allocation could be avoided (besides the idups), please let me know. I'm not sure what you mean by copy-on-write in this case. The functions in this module never modify the input arrays.
 I've read the discussions about the function names, but I still dare to
 make a suggestion of my own.  I feel like joinPath is a bit iffy,
 shouldn't it be plural somehow?  What it does is to create a path by
 putting pieces together, some paths in their own right, some not.  My
 suggestion is to simply call it buildPath.

Good point. joinPaths would be another option. -Lars
Jul 18 2011
parent torhu <no spam.invalid> writes:
On 18.07.2011 15:16, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

  joinPath also uses .idup, thus always allocates.  Maybe there could be a
  documented allocation policy for the module as a whole?  Copy-on-write
  used to be the general rule for Phobos, don't know if that's true
  anymore.  For most of the functions in this module, even a single heap
  allocation could be more expensive than the rest of what they do.

True. I've tried keeping allocations to a minimum. If you see other places where an allocation could be avoided (besides the idups), please let me know. I'm not sure what you mean by copy-on-write in this case. The functions in this module never modify the input arrays.

Sorry, don't know why I thought of COW in this case. Don't heap allocate if you don't need to, was what I meant. Which was because .idup was used, so nevermind that now :)
  I've read the discussions about the function names, but I still dare to
  make a suggestion of my own.  I feel like joinPath is a bit iffy,
  shouldn't it be plural somehow?  What it does is to create a path by
  putting pieces together, some paths in their own right, some not.  My
  suggestion is to simply call it buildPath.

Good point. joinPaths would be another option.

I didn't suggest joinPaths, because the inputs are not always paths, generally you join things like a path to a directory together with just a filename. Maybe it's ok to call it all just 'paths', I don't know. But the output is commonly a path, hence buildPath. Would be interesting to know what other people think, though.
Jul 18 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jul 2011 09:16:35 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at the
 ends two weeks from now on July 28.  This will be followed by a week of
 voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/



overengineered :)

Thanks!
 I noticed a couple of things that I don't think have been mentioned
 already:

 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

There is no reason for idup to duplicate an array which is already immutable. If it does, I'd say it is a compiler bug.

No, it has to duplicate. Part of idup is saying "I want an immutable return" and the other part is saying "I want a duplicate" For example: string x = "hello".idup; string y = x; assert(x.capacity > 0); // would fail if idup just returned the same thing x ~= " world"; assert(y is x); // would fail. The compiler cannot know whether you care or not. If you don't care, why are you calling idup on an immutable array? Just use it. -Steve
Jul 18 2011
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:j01h2f$2ia$2 digitalmars.com...
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it.  I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable.  But it's no big deal to change it to

   static if (!is(T == immutable)) return arr.idup;

 -Lars

Doesn't std.conv.to do this for you?
Jul 18 2011
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-07-18 08:08, Lars T. Kyllingstad wrote:
 On Tue, 19 Jul 2011 00:57:44 +1000, Daniel Murphy wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:j01h2f$2ia$2 digitalmars.com...
 
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it. I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable. But it's no big deal to change it
 to
 
 static if (!is(T == immutable)) return arr.idup;
 
 -Lars

Doesn't std.conv.to do this for you?

It probably does! I hadn't thought about that, but I'll check it out now. Thanks!

It does. That's why you should generally use std.conv.to instead of dup or idup directly. Really, dup and idup should only be used if you definitely _want_ a duplicate. Otherwise, std.conv.to will just (i)dup it when you actually need to. - Jonathan M Davis
Jul 18 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 18 Jul 2011 10:04:26 -0400, Steven Schveighoffer wrote:

 On Mon, 18 Jul 2011 09:16:35 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 On Sun, 17 Jul 2011 12:30:39 +0200, torhu wrote:

 On 15.07.2011 02:20, dsimcha wrote:
 Lars Kyllingstad's new and improved std.path module for Phobos is the
 next item up in the review queue.  I've volunteered to be the review
 manager.  Barring any objections, the review starts now and ends at
 the ends two weeks from now on July 28.  This will be followed by a
 week of voting, ending August 4th.

 Code:
 https://github.com/kyllingstad/phobos/blob/std-path/std/path.d

 Docs:
 http://www.kyllingen.net/code/new-std-path/phobos-prerelease/



overengineered :)

Thanks!
 I noticed a couple of things that I don't think have been mentioned
 already:

 The docs for defaultExtension say that there's one case where doesn't
 allocate.  But the code says that it always does (.idup isn't
 conditional, it's just .dup with a different return type).

There is no reason for idup to duplicate an array which is already immutable. If it does, I'd say it is a compiler bug.

No, it has to duplicate. Part of idup is saying "I want an immutable return" and the other part is saying "I want a duplicate" For example: string x = "hello".idup; string y = x; assert(x.capacity > 0); // would fail if idup just returned the same thing x ~= " world"; assert(y is x); // would fail. The compiler cannot know whether you care or not.

Ah, ok.
 If you don't care, why are you calling idup on an immutable array?  Just
 use it.

The specific cases in question are templated functions, where I don't know whether the array is immutable unless I specifically test it. I simply used arr.idup, assuming the compiler would optimise away the duplication when arr is immutable. But it's no big deal to change it to static if (!is(T == immutable)) return arr.idup; -Lars
Jul 18 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 19 Jul 2011 00:57:44 +1000, Daniel Murphy wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:j01h2f$2ia$2 digitalmars.com...
 The specific cases in question are templated functions, where I don't
 know whether the array is immutable unless I specifically test it.  I
 simply used arr.idup, assuming the compiler would optimise away the
 duplication when arr is immutable.  But it's no big deal to change it
 to

   static if (!is(T == immutable)) return arr.idup;

 -Lars

Doesn't std.conv.to do this for you?

It probably does! I hadn't thought about that, but I'll check it out now. Thanks! -Lars
Jul 18 2011
prev sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 18 Jul 2011 20:47:20 +0200, torhu wrote:

 I didn't suggest joinPaths, because the inputs are not always paths,
 generally you join things like a path to a directory together with just
 a filename.  Maybe it's ok to call it all just 'paths', I don't know.
 But the output is commonly a path, hence buildPath.  Would be
 interesting to know what other people think, though.

See my answer to Steve. I'm considering expanding joinPath's charter a bit, allowing it to take an input range of path segments and perform normalization while concatenating them. buildPath would be a good name for this. -Lars
Jul 20 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 16 Jul 2011 14:25:36 -0700, Jonathan M Davis wrote:

 There's no such thing as an OutOfMemoryException. It's an
 OutOfMemoryError. And being an error, I don't think that there's any
 need to mention it in the documentation. _Anything_ which allocates
 memory could cause the application to run out of memory, and being an
 Error, you're really not supposed to catch it, so there's no real
 point in mentioning it in the documentation.

I'll fix the documentation.

Actually, core.memory needs to be fixed on that count (that's probably where you got the idea about OutOfMemoryException). I should probably create a pull request to fix that.

That function (expandTilde) and its documentation is from the current std.path, dating back to D1. I haven't made any significant changes to it. What it does is call druntime's onOutOfMemoryError() handler, which I guess throws an OutOfMemoryException in D1(?) -Lars
Jul 17 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 15 Jul 2011 16:15:23 -0400, Nick Sabalausky <a a.a> wrote:

 "Steven Schveighoffer" <schveiguy yahoo.com> wrote in message
 news:op.vyoeirtveav7ka localhost.localdomain...
 On Fri, 15 Jul 2011 15:21:49 -0400, Nick Sabalausky <a a.a> wrote:

 Actually, if Windows did have one single equivalent to "/home/{user}",
 then
 I think it would be absolutely fantastic to make tilde mean "home
 directory"
 on Windows. But as things are, that's a moot point, of course.

The typical thing is to use the environment variable USERPROFILE, but that's an odd dependency for a string processing library. Windows also doesn't keep the same notion of home directory as Unix does. You can read more about it here: http://en.wikipedia.org/wiki/Environment_variable If that's the only function that uses environment variables, and you can't have the function without it, it might be worth having it. However, we must think about how that affects things like purity.

No, like I've said, all the possible choices are frequently wrong, and that includes USERPROFILE. If you're selecting a default directory for the user to save/load a word processing document, then USERPROFILE is wrong (should be the user's My Documents - or the last used directory, whatever). If you're storing per-user application-settings/internal-application-data, then USERPROFILE is wrong. It should be APPDATA if it's roaming data (should remain with the user on different computers). Or if it's machine-specific (non-roaming) data, then it belongs in whatever the env variable for "%APPDATA%\Local Settings\Application Data" is. So expanding tilde to USERPROFILE will just encourage people to do the wrong thing.

Certainly not having that function on Windows will encourage them to choose a wrong choice then, no? I mean if they want to find a "home directory" and phobos doesn't support that, then they google online, find something like %userprofile%, use it, and maybe file a bug against phobos for good measure :) So there is no right answer, we should at least come up with *an* answer. In the Windows world of de-facto standards, we're bound to start a trend ;) What about this? ~/Application Data => %APPDATA% ~/Local Settings/Application Data => %LOCALAPPDATA% or %USERPROFILE%\Local Settings\Application Data etc. Last resort: ~ => %USERPROFILE% In other words, expand more than just the tilde. My thoughts are, if we try and take a stab at making something intelligently decide where to store files, people will appreciate it more than having to search the web for a right answer to something that doesn't have one. -Steve
Jul 18 2011
prev sibling next sibling parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
Looks nice.

a thought:
alternate data streams are fringe and probably don't warrant support (you might
look into them), but windows \\?\ paths probably should be supported in eg
joinPath and pathSplitter
Jul 18 2011
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 18 Jul 2011 16:48:29 +0000, Ellery Newcomer wrote:

 Looks nice.
 
 a thought:
 alternate data streams are fringe and probably don't warrant support
 (you might look into them), but windows \\?\ paths probably should be
 supported in eg joinPath and pathSplitter

I'll look into it. :) -Lars
Jul 20 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 18 Jul 2011 14:29:08 -0400, torhu <no spam.invalid> wrote:

 On 18.07.2011 15:55, Steven Schveighoffer wrote:
 ...
 Certainly not having that function on Windows will encourage them to
 choose a wrong choice then, no?  I mean if they want to find a "home
 directory" and phobos doesn't support that, then they google online,  
 find
 something like %userprofile%, use it, and maybe file a bug against  
 phobos
 for good measure :)

 So there is no right answer, we should at least come up with *an*  
 answer.
 In the Windows world of de-facto standards, we're bound to start a  
 trend ;)

 What about this?

 ~/Application Data =>  %APPDATA%
 ~/Local Settings/Application Data =>  %LOCALAPPDATA% or  
 %USERPROFILE%\Local
 Settings\Application Data
 etc.
 Last resort:
 ~ =>  %USERPROFILE%

 In other words, expand more than just the tilde.

Wouldn't that tend to cause the same issue? They use tilde on Linux, then get %userprofile% because they didn't change it when they port to Windows. If people know that they want %appdata%, they could just use that.

So without expansion they get some local directory named ~ created (~ is a valid directory name in Windows). That sounds worse... I'd rather have it do something sane when I'm not smart enough to special-code Windows stuff, but when I do want to handle it, I'd like to not have to deal with the environment (i.e. differences between XP and Win7).
 My thoughts are, if we try and take a stab at making something
 intelligently decide where to store files, people will appreciate it  
 more
 than having to search the web for a right answer to something that  
 doesn't
 have one.

That could be done, here's an example I'm familiar with: http://www.allegro.cc/manual/5/al_get_standard_path This is designed to work across all the major platforms. Some paths will be the same on some platforms. ALLEGRO_USER_DOCUMENTS_PATH, ALLEGRO_USER_DATA_PATH, and ALLEGRO_USER_SETTINGS_PATH are probably the most relevant ones for this discussion.

I don't know if it's worth mapping the Windows directory scheme on to Linux any more than it's worth mapping the Linux conventions onto the Windows ones. The one thing that's common is, both have a OS-specified location to store a specific user's file (Home directory). I realize that using %USERPROFILE% is not *correct*. But it's better than doing *nothing*. So my view is: ~ => by itself it maps to your home directory, be that /home/you or %USERPROFILE%. ~/xyz => on Linux maps to a subdirectory xyz under home, on Windows, may be a more intelligent expansion. -Steve
Jul 18 2011