digitalmars.D - std.path review: update

Lars T. Kyllingstad (29/29) Jul 17 2011 Based on your comments, I have made some changes to my std.path

bearophile (4/5) Jul 17 2011 compatibleStrings is a template still.

Lars T. Kyllingstad (9/14) Jul 18 2011 I know. Sorry, forgot to mention that. For now, I'd like to keep it th...

Jonathan M Davis (9/21) Jul 18 2011 And it _should_ be a template. All of the stuff like that are templates....

bearophile (25/32) Jul 18 2011 This seems to work:

Jonathan M Davis (12/52) Jul 18 2011 Okay. Yes, you could do that. But what you're doing is basically the sam...

bearophile (4/14) Jul 18 2011 The gain of my version is that it doesn't generate tons of templates. Fr...

Andrej Mitrovic (10/12) Jul 17 2011 Actually I withdraw that feature request. Some tools will work with

Lars T. Kyllingstad (3/10) Jul 18 2011 Noted. :)
Steven Schveighoffer (6/19) Jul 18 2011 Hum... I wonder if normalize should do this...

Lars T. Kyllingstad (11/36) Jul 18 2011 normalize does this on Windows, where '/' is also a directory separator,...

Steven Schveighoffer (13/47) Jul 18 2011 OK, this is what I meant. By canonical path, I mean I should be able to...

Jesse Phillips (7/15) Jul 17 2011 I'm not sure my opinion on this. It seems like a useful idea, but as
Brian Schott (3/3) Jul 17 2011 The documentation comments for driveName say that the return value will

Jonathan M Davis (13/16) Jul 17 2011 The fun part with that is that "" == null and a null string is empty per...

Lars T. Kyllingstad (4/22) Jul 18 2011 Pending a decision on the null vs. empty issue, I have now standardised

torhu (17/38) Jul 18 2011 I'd like to make a case for null as the 'nothing here' value.

Lars T. Kyllingstad (8/52) Jul 18 2011 True, but the question was not whether one should use null or "" for the...

torhu (7/31) Jul 18 2011 I meant to imply that null and empty should not be used to mean two

Jonathan M Davis (17/49) Jul 18 2011 There are definitely situations where it is valuable to differentiate be...

Andrei Alexandrescu (7/37) Jul 18 2011 Note that there are two aspects: generating 'nothing here' values, and

Lars T. Kyllingstad (17/57) Jul 18 2011 Some have argued that there is an extra dimension to this, namely the

Vladimir Panteleev (6/8) Jul 18 2011 Is it still an implementation detail if it's documented behavior?

Vladimir Panteleev (1/1) Jul 18 2011 Sorry, I thought you meant the old getExt().

Steven Schveighoffer (12/48) Jul 18 2011 The one that's kind of nice is the if(path.extension), which reads not

Lars T. Kyllingstad (4/11) Jul 18 2011 It seems I forgot about the CTFEability tests. I'll fix that too, and

Lars T. Kyllingstad (6/18) Jul 18 2011 Done. Most functions were CTFEable without any modifications (thanks,

Steven Schveighoffer (27/31) Jul 18 2011 This is a review of the docs/design. I'll review the code separately:

Lars T. Kyllingstad (30/75) Jul 18 2011 Oops. Thanks!

Jonathan M Davis (7/97) Jul 18 2011 I suggest that you do what I did in std.file (e.g. with getTimesWin). I ...
Steven Schveighoffer (53/106) Jul 18 2011 It is and it isn't. It's *not* a normal directory, because only shares ...

Lars T. Kyllingstad (27/162) Jul 20 2011 Then driveName() should probably return the full share path. But, of th...

Steven Schveighoffer (15/104) Jul 20 2011 It is in that if you open explorer and type in \\servername, it will giv...

Lars T. Kyllingstad (6/9) Jul 20 2011 Any .NET programmers out there? Can you please tell me what the

Jussi Jumppanen (36/42) Jul 21 2011 This code:

Lars T. Kyllingstad (13/61) Jul 21 2011 Thanks, this is very helpful. Now we know that MS's APIs treat \\foo\ba...

Rainer Schuetze (7/10) Jul 21 2011 If that's true for the bare open() without going through possible

Lars T. Kyllingstad (4/17) Jul 21 2011 All right, I'll remove "//path" support again. That simplifies things

Lars T. Kyllingstad (6/112) Jul 20 2011 Actually, I realise now, it doesn't. :) Since joinPath/buildPath needs
Rainer Schuetze (5/36) Jul 21 2011 I like the direction that this is heading. If the idea gets extended to

Nick Sabalausky (16/41) Jul 19 2011 I don't know whether or not it's "never" a valid path, but "dir \\server...

Andrej Mitrovic (2/2) Jul 19 2011 Here's some relevant info:
Steven Schveighoffer (21/32) Jul 20 2011 I've done it before, mounted a windows share on a linux box via cifs.
Lars T. Kyllingstad (9/66) Jul 20 2011 That check would probably be orders of magnitude more expensive than a

Alix Pexton (6/11) Jul 20 2011 Wikipedia says Windows long file names are up to 255 UTF-16 characters

Lars T. Kyllingstad (3/17) Jul 21 2011 Thanks! In other words, fcmp() needs to do UTF-16 decoding...

Rainer Schuetze (5/9) Jul 20 2011 I just tried a few examples: Using umlauts works as expected, i.e. upper...

torhu (6/10) Jul 21 2011 I guess you've already thought about this, but one solution is to just

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

Based on your comments, I have made some changes to my std.path 
proposal.  A list of the changes I have made can be found at the 
following address (look at the commits dated 2011-07-17):

  https://github.com/kyllingstad/phobos/commits/std-path

I believe I have covered most of your requests, with a few exceptions:

Firstly, Jonathan argued very convincingly that the contents of the 
current std.path should be put back in, marked as "scheduled for 
deprecation".  I intend to do this when the review is over, if my 
submission gets accepted.  For now, ignore the bottommost deprecated: 
block.

Secondly, David and Jonathan suggested I optimise functions like 
setExtension() using ~= to append when possible.  I have tried doing so 
for setExtension(), and I'm not convinced the extra complexity is worth 
the relatively modest gain.  The specialised, optimised version can be 
found here:

  https://github.com/kyllingstad/phobos/blob/std-path/std/path.d#L529

Finally, there are some requests with which I don't personally agree.  
Therefore, I'd like to get more opinions before making any changes:

- Should I add toNativePath(), which replaces '/' with '\' on Windows and 
vice versa on POSIX?

- Should it be specified/documented whether a function returns "" or 
null?  Specifically, is it important that

    extension("foo") is null
    extension("foo.") !is null && extension("foo.") == ""

- Do people agree with Jonathan's views on function names?


As before, code and docs can be found here:

https://github.com/kyllingstad/phobos/blob/std-path/std/path.d
http://www.kyllingen.net/code/new-std-path/phobos-prerelease/std_path.html

-Lars

Jul 17 2011
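
The null-versus-empty question raised in the post above can be made concrete with a small sketch. Note that `extensionSketch` is a hypothetical stand-in written for illustration, not the proposed std.path code: it assumes '/' is the only directory separator and ignores drive letters and leading-dot file names.

```d
// Hypothetical stand-in for the proposed extension(); not the std.path code.
string extensionSketch(string path)
{
    foreach_reverse (i, c; path)
    {
        if (c == '/') break;                    // no dot in the last component
        if (c == '.') return path[i + 1 .. $];  // may be a zero-length slice
    }
    return null;                                // no extension at all
}

void main()
{
    // No dot: the result is null.
    assert(extensionSketch("foo") is null);
    // Trailing dot: the result is empty but not null.
    assert(extensionSketch("foo.") !is null && extensionSketch("foo.") == "");
    assert(extensionSketch("foo.txt") == "txt");
    // == cannot tell the two cases apart; only `is` can.
    assert(extensionSketch("foo") == "");
}
```

If the distinction is documented, callers have to use `is null` to rely on it, since `==` treats both results as equal.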

bearophile <bearophileHUGS lycos.com> writes:

Lars T. Kyllingstad:

 I believe I have covered most of your requests, with a few exceptions:

compatibleStrings is a template still.

Bye,
bearophile

Jul 17 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 17 Jul 2011 17:43:42 -0400, bearophile wrote:

 Lars T. Kyllingstad:
 
 I believe I have covered most of your requests, with a few exceptions:

 
 compatibleStrings is a template still.

I know.  Sorry, forgot to mention that.  For now, I'd like to keep it the 
way it is.  I can't find any precedent in Phobos for turning these kinds 
of tests into CTFEable functions, and if compatibleStrings were to end up 
in std.traits, for instance, it would stand out as being different from 
everything else in there.  If it is decided that it is better to write 
these tests as ordinary functions, that should probably be done 
throughout Phobos.

-Lars

Jul 18 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 18 July 2011 09:35:17 Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 17:43:42 -0400, bearophile wrote:
 Lars T. Kyllingstad:
 I believe I have covered most of your requests, with a few exceptions:

 compatibleStrings is a template still.

 
 I know.  Sorry, forgot to mention that.  For now, I'd like to keep it the
 way it is.  I can't find any precedent in Phobos for turning these kinds
 of tests into CTFEable functions, and if compatibleStrings were to end up
 in std.traits, for instance, it would stand out as being different from
 everything else in there.  If it is decided that it is better to write
 these tests as ordinary functions, that should probably be done
 throughout Phobos.

And it _should_ be a template. All of the stuff like that are templates. And 
I'm not even sure that it _can_ be a function. And even if it can, what would 
we gain by making it a function anyway? It's operating on types. It's of no 
use at runtime. It's a perfect candidate for an eponymous template. 
std.traits, std.range, etc. do this sort of thing in pretty much exactly the 
same way. There may be a cleaner way to write it than it currently is, but 
using an eponymous template like that is the correct thing to do.

- Jonathan M Davis

Jul 18 2011
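
For readers unfamiliar with the term, an eponymous template of the kind described above might look roughly like the following. This is only a sketch: the name `compatibleStringsSketch` and its exact rules are assumptions made for illustration, and it is not the code from the proposal.

```d
import std.traits : isSomeString, Unqual;

// Hypothetical sketch: true if every argument is a string type and all of
// them share the same unqualified character type.
template compatibleStringsSketch(Strings...)
    if (Strings.length > 0)
{
    static if (!isSomeString!(Strings[0]))
        enum compatibleStringsSketch = false;
    else static if (Strings.length == 1)
        enum compatibleStringsSketch = true;
    else static if (!isSomeString!(Strings[1]))
        enum compatibleStringsSketch = false;
    else
        // Compare the first two character types, then recurse on the rest.
        enum compatibleStringsSketch =
            is(Unqual!(typeof(Strings[0].init[0])) == Unqual!(typeof(Strings[1].init[0])))
            && compatibleStringsSketch!(Strings[1 .. $]);
}

static assert( compatibleStringsSketch!(char[], const(char)[], string));
static assert(!compatibleStringsSketch!(char[], wchar[]));
static assert(!compatibleStringsSketch!(int[], const(int)[]));

void main() {}
```

Because the template contains a single member with the same name as the template itself, `compatibleStringsSketch!(...)` evaluates directly to the boolean, with no call parentheses needed.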

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

 And it _should_ be a template. All of the stuff like that are templates. And 
 I'm not even sure that it _can_ be a function. And even if it can, what would 
 we gain by making it a function anyway? It's operating on types. It's of no 
 use at runtime. It's a perfect candidate for an eponymous template. 
 std.traits, std.range, etc. do this sort of thing in pretty much exactly the 
 same way. There may be a cleaner way to write it than it currently is, but 
 using an eponymous template like that is the correct thing to do.

This seems to work:


import std.traits: isSomeChar, Unqual, isSomeString;

bool compatibleStrings(Strings...)() if (Strings.length) {
    static if (isSomeString!(Strings[0])) {
        alias Unqual!(typeof(Strings[0].init[0])) TC;
        foreach (s; Strings[1 .. $])
            static if (isSomeString!s && !is(TC == Unqual!(typeof(s.init[0]))))
                return false;
        return true;
    } else
        return false;
}

version (unittest) {
    static assert (compatibleStrings!(char[], const(char)[], string)());
    static assert (compatibleStrings!(wchar[], const(wchar)[], wstring)());
    static assert (compatibleStrings!(dchar[], const(dchar)[], dstring)());
    static assert (!compatibleStrings!(int[], const(int)[], immutable(int)[])());
    static assert (!compatibleStrings!(char[], wchar[])());
    static assert (!compatibleStrings!(char[], dstring)());
}

void main() {}

I have written tons of such things in dlibs1, and generally I have seen that
recursive templates are slower and need more RAM than similar functions.

Bye,
bearophile

Jul 18 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday 18 July 2011 06:28:50 bearophile wrote:
 Jonathan M Davis:
 And it _should_ be a template. All of the stuff like that are templates.
 And I'm not even sure that it _can_ be a function. And even if it can,
 what would we gain by making it a function anyway? It's operating on
 types. It's of no use at runtime. It's a perfect candidate for an
 eponymous template. std.traits, std.range, etc. do this sort of thing
 in pretty much exactly the same way. There may be a cleaner way to
 write it than it currently is, but using an eponymous template like
 that is the correct thing to do.

 
 This seems to work:
 
 
 import std.traits: isSomeChar, Unqual, isSomeString;
 
 bool compatibleStrings(Strings...)() if (Strings.length) {
     static if (isSomeString!(Strings[0])) {
         alias Unqual!(typeof(Strings[0].init[0])) TC;
         foreach (s; Strings[1 .. $])
             static if (isSomeString!s && !is(TC == Unqual!(typeof(s.init[0]))))
                 return false;
         return true;
     } else
         return false;
 }
 
 version (unittest) {
     static assert (compatibleStrings!(char[], const(char)[], string)());
     static assert (compatibleStrings!(wchar[], const(wchar)[], wstring)());
     static assert (compatibleStrings!(dchar[], const(dchar)[], dstring)());
     static assert (!compatibleStrings!(int[], const(int)[], immutable(int)[])());
     static assert (!compatibleStrings!(char[], wchar[])());
     static assert (!compatibleStrings!(char[], dstring)());
 }
 
 void main() {}
 
 I have written tons of such things in dlibs1, and generally I have seen that
 recursive templates are slower and need more RAM than similar functions.

Okay. Yes, you could do that. But what you're doing is basically the same as 
the eponymous template except that it's wrapping the value in a function so 
that it can be called at runtime. The gain is 0, and it's potentially confusing. 
It's no better than

bool compatibleStringsFunc(Strings...)()
{
	enum retval = compatibleStrings!Strings;
	return retval;
}

But you _did_ find a way to turn it into a function.

- Jonathan M Davis

Jul 18 2011

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

 But what you're doing is basically the same as 
 the eponymous template except that it's wrapping the value in a function so 
 that it can be called at runtime. The gain is 0, and it's potentially confusing.
 It's no better than
 
 bool compatibleStringsFunc(Strings...)()
 {
 	enum retval = compatibleStrings!Strings;
 	return retval;
 }

The gain of my version is that it doesn't generate tons of templates. From my
experience such functions lead to faster compile times and less memory used by
the compiler compared to using recursive templates. And for me a foreach is
usually less confusing than recursive templates :-)

Bye,
bearophile

Jul 18 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 - Should I add toNativePath(), which replaces '/' with '\' on Windows and
 vice versa on POSIX?

Actually I withdraw that feature request. Some tools will work with
only forward slashes, others only backward slashes, but this is
regardless of what platform they're on.

E.g. some tools don't work with forward slashes, while GIT doesn't
work with backward slashes when running on Windows.

I think .replace(r"\", "/") and .replace("/", r"\") are good enough,
but maybe an alias to each version wouldn't be bad. E.g.
"toForwardSlash" and "toBackslash". It's not hard to define this in
our own code, so it's not really a feature request.

Jul 17 2011
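
The two helpers suggested above could be defined in user code along these lines. The names `toForwardSlash` and `toBackslash` come from the post; the implementation is a minimal sketch using `std.array.replace`, not anything taken from the proposal.

```d
import std.array : replace;

// Replace every backslash with a forward slash, regardless of platform.
string toForwardSlash(string path) { return path.replace(r"\", "/"); }

// Replace every forward slash with a backslash, regardless of platform.
string toBackslash(string path) { return path.replace("/", r"\"); }

void main()
{
    assert(toForwardSlash(r"dir\sub\file.d") == "dir/sub/file.d");
    assert(toBackslash("dir/sub/file.d") == r"dir\sub\file.d");
}
```

Keep in mind that on POSIX a backslash can be a legitimate filename character, so a blind replacement is only safe when the caller knows how the path was produced.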

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 00:24:30 +0200, Andrej Mitrovic wrote:

 On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 - Should I add toNativePath(), which replaces '/' with '\' on Windows
 and vice versa on POSIX?

 
 Actually I withdraw that feature request. Some tools will work with only
 forward slashes, others only backward slashes, but this is regardless of
 what platform they're on.

Noted. :)

-Lars

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 17 Jul 2011 18:24:30 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 - Should I add toNativePath(), which replaces '/' with '\' on Windows  
 and
 vice versa on POSIX?

 Actually I withdraw that feature request. Some tools will work with
 only forward slashes, others only backward slashes, but this is
 regardless of what platform they're on.

 E.g. some tools don't work with forward slashes, while GIT doesn't
 work with backward slashes when running on Windows.

 I think .replace(r"\", "/") and .replace("/", r"\") are good enough,
 but maybe an alias to each version wouldn't be bad. E.g.
 "toForwardSlash" and "toBackslash". It's not hard to define this in
 our own code, so it's not really a feature request.


Hum... I wonder if normalize should do this...

Is normalize supposed to create a canonical path?  If so, then this needs  
to happen.

-Steve

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 13:26:08 -0400, Steven Schveighoffer wrote:

 On Sun, 17 Jul 2011 18:24:30 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:
 
 On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 - Should I add toNativePath(), which replaces '/' with '\' on Windows
 and
 vice versa on POSIX?

 Actually I withdraw that feature request. Some tools will work with
 only forward slashes, others only backward slashes, but this is
 regardless of what platform they're on.

 E.g. some tools don't work with forward slashes, while GIT doesn't work
 with backward slashes when running on Windows.

 I think .replace(r"\", "/") and .replace("/", r"\") are good enough,
 but maybe an alias to each version wouldn't be bad. E.g.
 "toForwardSlash" and "toBackslash". It's not hard to define this in our
 own code, so it's not really a feature request.

 
 
 Hum... I wonder if normalize should do this...
 
 Is normalize supposed to create a canonical path?  If so, then this
 needs to happen.

normalize does this on Windows, where '/' is also a directory separator, 
but not on POSIX, where '\' is an ordinary filename character.

I am not entirely sure what the exact definition of "canonical path" is, 
but according to some it entails resolving symlinks.  normalize does not 
do this, but it does everything else:

  - resolves . and .. to the extent possible
  - collapses redundant directory separators
  - changes '/' to '\' on Windows

-Lars

Jul 18 2011
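
The three normalization steps listed above can be tried out with `buildNormalizedPath` from today's std.path. That function is used here as an assumption: it is not necessarily identical to the `normalize` function under review in this thread, but it performs the same kind of purely textual cleanup.

```d
import std.path : buildNormalizedPath;

void main()
{
    version (Posix)
    {
        // "." and ".." are resolved and repeated separators are collapsed.
        assert(buildNormalizedPath("/foo/./bar/..//baz/") == "/foo/baz");
        // A leading ".." cannot be resolved textually, so it is kept.
        assert(buildNormalizedPath("../foo/.") == "../foo");
        // '\' is an ordinary filename character here, so it is left alone.
        assert(buildNormalizedPath(r"foo\bar") == r"foo\bar");
    }
    version (Windows)
    {
        // '/' is also a separator on Windows and is changed to '\'.
        assert(buildNormalizedPath(r"c:\foo\.\bar/..\\baz\") == r"c:\foo\baz");
    }
}
```

None of this touches the file system, which is why symlinks and home-directory shortcuts are out of reach for this kind of function.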

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Jul 2011 14:30:51 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Mon, 18 Jul 2011 13:26:08 -0400, Steven Schveighoffer wrote:

 On Sun, 17 Jul 2011 18:24:30 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 On 7/17/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 - Should I add toNativePath(), which replaces '/' with '\' on Windows
 and
 vice versa on POSIX?

 Actually I withdraw that feature request. Some tools will work with
 only forward slashes, others only backward slashes, but this is
 regardless of what platform they're on.

 E.g. some tools don't work with forward slashes, while GIT doesn't work
 with backward slashes when running on Windows.

 I think .replace(r"\", "/") and .replace("/", r"\") are good enough,
 but maybe an alias to each version wouldn't be bad. E.g.
 "toForwardSlash" and "toBackslash". It's not hard to define this in our
 own code, so it's not really a feature request.


 Hum... I wonder if normalize should do this...

 Is normalize supposed to create a canonical path?  If so, then this
 needs to happen.

 normalize does this on Windows, where '/' is also a directory separator,
 but not on POSIX, where '\' is an ordinary filename character.

 I am not entirely sure what the exact definition of "canonical path" is,
 but according to some it entails resolving symlinks.  normalize does not
 do this, but it does everything else:

   - resolves . and .. to the extent possible
   - collapses redundant directory separators
   - changes '/' to '\' on Windows

OK, this is what I meant.  By canonical path, I mean I should be able to  
take two paths that point to the same filename and normalize should output  
the same string for both.

I agree that the posix version should not replace \ with /, since that's a  
Windows specific issue.

I realize there are some limitations when all you are doing is string  
manipulation.  For example ~steves/blah resolves to the canonical path  
/home/steves/blah.  Same thing with symlinks.

I guess normalize is the best term for it, don't want to confuse it with  
full canonical.

-Steve

Jul 18 2011

Jesse Phillips <jessekphillips+d gmail.com> writes:

On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:

 - Should I add toNativePath(), which replaces '/' with '\' on Windows
 and vice versa on POSIX?

I'm not sure of my opinion on this. It seems like a useful idea, but as 
Andrej points out it may just cause other issues.
 
 - Should it be specified/documented whether a function returns "" or
 null?  Specifically, is it important that
 
     extension("foo") is null
     extension("foo.") !is null && extension("foo.") == ""

I don't think it is important, but probably should be documented.
 
 - Do people agree with Jonathan's views on function names?

I think I did.

Jul 17 2011

Brian Schott <brian-schott cox.net> writes:

The documentation comments for driveName say that the return value will
be an empty string in some circumstances, but the code and unit tests
both say that the behavior is to return null.

Jul 17 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
 The documentation comments for driveName say that the return value will
 be an empty string in some circumstances, but the code and unit tests
 both say that the behavior is to return null.

The fun part with that is that "" == null and a null string is empty per 
std.array.empty, so it _is_ the empty string. The only difference is that "" 
!is null. So, if the function says that it returns null, then it needs to 
return null. Since it says that it returns the empty string, it could return 
either.

Now, in spite of all that, there's still a problem since the tests verify that 
the return value is null, not empty. Either the documentation should say that 
it returns null, or the tests should be checking for empty, not null. But 
still, the documentation isn't incorrect. And the tests are perfectly valid, 
but they really shouldn't be testing for is null instead of empty when the 
function is supposed to return empty.

- Jonathan M Davis

Jul 17 2011
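
The relationships described in the post above can be checked directly. This is a minimal sketch; the variable names are arbitrary.

```d
import std.array : empty;

void main()
{
    string n = null;
    string e = "";

    assert(n == e);              // == compares contents: both have length 0
    assert(n is null);           // identity: the null string is null
    assert(e !is null);          // identity: the "" literal is not null
    assert(n.empty && e.empty);  // both are empty ranges
}
```

So a test written with `empty` or `==` accepts either return value, while a test written with `is null` accepts only one of them.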

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

 On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
 The documentation comments for driveName say that the return value will
 be an empty string in some circumstances, but the code and unit tests
 both say that the behavior is to return null.

 
 The fun part with that is that "" == null and a null string is empty per
 std.array.empty, so it _is_ the empty string. The only difference is
 that "" !is null. So, if the function says that it returns null, then it
 needs to return null. Since it says that it returns the empty string, it
 could return either.
 
 Now, in spite of all that, there's still a problem since the tests
 verify that the return value is null, not empty. Either the
 documentation should say that it returns null, or the tests should be
 checking for empty, not null. But still, the documentation isn't
 incorrect. And the tests are perfectly valid, but they really shouldn't
 be testing for is null instead of empty when the function is supposed to
 return empty.

Pending a decision on the null vs. empty issue, I have now standardised 
on using empty() for testing whether functions return empty strings.

-Lars

Jul 18 2011

torhu <no spam.invalid> writes:

On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

  On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
  The documentation comments for driveName say that the return value will
  be an empty string in some circumstances, but the code and unit tests
  both say that the behavior is to return null.

  The fun part with that is that "" == null and a null string is empty per
  std.array.empty, so it _is_ the empty string. The only difference is
  that "" !is null. So, if the function says that it returns null, then it
  needs to return null. Since it says that it returns the empty string, it
  could return either.

  Now, in spite of all that, there's still a problem since the tests
  verify that the return value is null, not empty. Either the
  documentation should say that it returns null, or the tests should be
  checking for empty, not null. But still, the documentation isn't
  incorrect. And the tests are perfectly valid, but they really shouldn't
  be testing for is null instead of empty when the function is supposed to
  return empty.

 Pending a decision on the null vs. empty issue, I have now standardised
 on using empty() for testing whether functions return empty strings.

I'd like to make a case for null as the 'nothing here' value.

The advantage of using null is that all possible ways of testing for 
'nothingness' (is, ==, as a boolean condition, empty range) will work. 
But if you return an empty string, you can't do 'str is null', because 
that will be false.  With null there's just no doubt, and no way to get 
the test wrong.

As far as I can tell by the testing I've done, you can use a null string 
in every way that you can use an empty string, even append to it with 
~=.   The distinction between null and empty strings is significant in C 
and Java, but in D it's not, and the tiny difference that actually 
exists mainly serves to confuse people.  It doesn't help that the actual 
differences are largely undocumented either.

One difference is that a statically allocated empty string is null 
terminated, but I think that can be safely ignored in the case of return 
values.

By the way, did you read my post in the other thread?

Jul 18 2011
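
The claim above that a null string can be used wherever an empty one can is easy to try out. The following is a small sketch of the operations mentioned, with arbitrary variable names.

```d
void main()
{
    string s = null;

    assert(s.length == 0);       // length is well defined for a null string
    foreach (c; s)               // iterating over it simply does nothing
        assert(false);

    string t = s ~ "x";          // concatenation works
    assert(t == "x");

    s ~= "abc";                  // appending with ~= works too
    assert(s == "abc");
}
```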

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 14:23:18 +0200, torhu wrote:

 On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

  On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
  The documentation comments for driveName say that the return value
  will be an empty string in some circumstances, but the code and unit
  tests both say that the behavior is to return null.

  The fun part with that is that "" == null and a null string is empty
  per std.array.empty, so it _is_ the empty string. The only difference
  is that "" !is null. So, if the function says that it returns null,
  then it needs to return null. Since it says that it returns the empty
  string, it could return either.

  Now, in spite of all that, there's still a problem since the tests
  verify that the return value is null, not empty. Either the
  documentation should say that it returns null, or the tests should be
  checking for empty, not null. But still, the documentation isn't
  incorrect. And the tests are perfectly valid, but they really
  shouldn't be testing for is null instead of empty when the function
  is supposed to return empty.

 Pending a decision on the null vs. empty issue, I have now standardised
 on using empty() for testing whether functions return empty strings.

 
 I'd like to make a case for null as the 'nothing here' value.
 
 The advantage of using null is that all possible ways of testing for
 'nothingness' (is, ==, as a boolean condition, empty range) will work.
 But if you return an empty string, you can't do 'str is null', because
 that will be false.  With null there's just no doubt, and no way to get
 the test wrong.
 
 As far as I can tell by the testing I've done, you can use a null string
 in every way that you can use an empty string, even append to it with
 ~=.   The distinction between null and empty strings is significant in C
 and Java, but in D it's not, and the tiny difference that actually
 exists mainly serves to confuse people.  It doesn't help that the actual
 differences are largely undocumented either.
 
 One difference is that a statically allocated empty string is null
 terminated, but I think that can be safely ignored in the case of return
 values.

True, but the question was not whether one should use null or "" for the 
"nothing here" return value of a function.  The question was whether the 
function returning null should mean something different than it returning 
"".


 By the way, did you read my post in the other thread?

Yes, I read it, but I forgot to answer.  Sorry about that.  I've answered 
now.

-Lars

Jul 18 2011

torhu <no spam.invalid> writes:

On 18.07.2011 16:18, Lars T. Kyllingstad wrote:
 On Mon, 18 Jul 2011 14:23:18 +0200, torhu wrote:
  I'd like to make a case for null as the 'nothing here' value.

  The advantage of using null is that all possible ways of testing for
  'nothingness' (is, ==, as a boolean condition, empty range) will work.
  But if you return an empty string, you can't do 'str is null', because
  that will be false.  With null there's just no doubt, and no way to get
  the test wrong.

  As far as I can tell by the testing I've done, you can use a null string
  in every way that you can use an empty string, even append to it with
  ~=.   The distinction between null and empty strings is significant in C
  and Java, but in D it's not, and the tiny difference that actually
  exists mainly serves to confuse people.  It doesn't help that the actual
  differences are largely undocumented either.

  One difference is that a statically allocated empty string is null
  terminated, but I think that can be safely ignored in the case of return
  values.

 True, but the question was not whether one should use null or "" for the
 "nothing here" return value of a function.  The question was whether the
 function returning null should mean something different than it returning
 "".

I meant to imply that null and empty should not be used to mean two 
different things, sorry if I didn't make myself clear.  AFAIK, none of 
the Phobos functions that take string arguments care about the 
difference.  If the length is zero, the pointer value is ignored.  In 
light of this, I don't know what different meanings null and empty would 
or should have.

Jul 18 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On 2011-07-18 10:51, torhu wrote:
 On 18.07.2011 16:18, Lars T. Kyllingstad wrote:
 On Mon, 18 Jul 2011 14:23:18 +0200, torhu wrote:
 I'd like to make a case for null as the 'nothing here' value.
 
 The advantage of using null is that all possible ways of testing for
 'nothingness' (is, ==, as a boolean condition, empty range) will work.
 But if you return an empty string, you can't do 'str is null', because
 that will be false. With null there's just no doubt, and no way to get
 the test wrong.
 
 As far as I can tell by the testing I've done, you can use a null
 string in every way that you can use an empty string, even append to
 it with ~=. The distinction between null and empty strings is
 significant in C and Java, but in D it's not, and the tiny difference
 that actually exists mainly serves to confuse people. It doesn't help
 that the actual differences are largely undocumented either.
 
 One difference is that a statically allocated empty string is null
 terminated, but I think that can be safely ignored in the case of
 return values.

 
 True, but the question was not whether one should use null or "" for the
 "nothing here" return value of a function. The question was whether the
 function returning null should mean something different than it returning
 "".

 
 I meant to imply that null and empty should not be used to mean two
 different things, sorry if I didn't make myself clear. AFAIK, none of
 the Phobos functions that take string arguments care about the
 difference. If the length is zero, the pointer value is ignored. In
 light of this, I don't know what different meanings null and empty would
 or should have.

There are definitely situations where it is valuable to differentiate between 
null and empty, but in the case of D arrays, they really aren't designed for 
it, because nearly everything in the language treats them as being the same 
thing. There may be some value in differentiating them in spite of that, but 
it doesn't generally work very well. One of the few places would be the return 
value of a function. So, if there could reasonably be a difference between "" 
and null for the return value of a function, then it could be reasonable to 
have null mean something different than "". But the truth is that that's going to 
be error prone, because people are likely to use == null instead of is null, 
not realizing that == null doesn't do what they want (in fact, arguably, == 
null merits a warning). So, if there's no clear gain in returning null, the 
documentation should just say that it returns empty, and then it doesn't 
matter whether it returns "" or null. It _is_ a bit of a conundrum though. I'm 
not sure that making null and "" virtually identical was ultimately a good 
idea, but we're stuck with it at this point.
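
A minimal sketch of the difference, assuming a current D compiler (where empty for arrays is found in std.range.primitives rather than std.array). The helper names isNull and equalsNull are made up for illustration:

```d
import std.range.primitives : empty;

/// True only for a string whose pointer is null.
bool isNull(string s) { return s is null; }

/// True for both null and "", because == compares contents.
bool equalsNull(string s) { return s == null; }

unittest
{
    string fromLiteral = "";

    // `is` tells the two apart...
    assert(isNull(null) && !isNull(fromLiteral));

    // ...but == and empty do not.
    assert(equalsNull(null) && equalsNull(fromLiteral));
    assert(fromLiteral.empty);
}
```

Compiled with dmd -unittest -main, the asserts show why a caller who writes == null where is null was meant gets no complaint from the compiler and a different answer.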

- Jonathan M Davis

Jul 18 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 7/18/11 7:23 AM, torhu wrote:
 On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

 On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
 The documentation comments for driveName say that the return value will
 be an empty string in some circumstances, but the code and unit tests
 both say that the behavior is to return null.

 The fun part with that is that "" == null and a null string is empty per
 std.array.empty, so it _is_ the empty string. The only difference is
 that "" !is null. So, if the function says that it returns null, then it
 needs to return null. Since it says that it returns the empty string, it
 could return either.

 Now, in spite of all that, there's still a problem since the tests
 verify that the return value is null, not empty. Either the
 documentation should say that it returns null, or the tests should be
 checking for empty, not null. But still, the documentation isn't
 incorrect. And the tests are perfectly valid, but they really shouldn't
 be testing for is null instead of empty when the function is supposed to
 return empty.

 Pending a decision on the null vs. empty issue, I have now standardised
 on using empty() for testing whether functions return empty strings.

 I'd like to make a case for null as the 'nothing here' value.

 The advantage of using null is that all possible ways of testing for
 'nothingness' (is, ==, as a boolean condition, empty range) will work.
 But if you return an empty string, you can't do 'str is null', because
 that will be false. With null there's just no doubt, and no way to get
 the test wrong.

Note that there are two aspects: generating 'nothing here' values, and 
testing for 'nothing here'.

In keeping with the "be generous with what you receive and conservative 
with what you send" mantra, good functions should test string inputs 
with str.empty and return null strings.


Andrei

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 09:38:08 -0500, Andrei Alexandrescu wrote:

 On 7/18/11 7:23 AM, torhu wrote:
 On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

 On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
 The documentation comments for driveName say that the return value
 will be an empty string in some circumstances, but the code and unit
 tests both say that the behavior is to return null.

 The fun part with that is that "" == null and a null string is empty
 per std.array.empty, so it _is_ the empty string. The only difference
 is that "" !is null. So, if the function says that it returns null,
 then it needs to return null. Since it says that it returns the empty
 string, it could return either.

 Now, in spite of all that, there's still a problem since the tests
 verify that the return value is null, not empty. Either the
 documentation should say that it returns null, or the tests should be
 checking for empty, not null. But still, the documentation isn't
 incorrect. And the tests are perfectly valid, but they really
 shouldn't be testing for is null instead of empty when the function
 is supposed to return empty.

 Pending a decision on the null vs. empty issue, I have now
 standardised on using empty() for testing whether functions return
 empty strings.

 I'd like to make a case for null as the 'nothing here' value.

 The advantage of using null is that all possible ways of testing for
 'nothingness' (is, ==, as a boolean condition, empty range) will work.
 But if you return an empty string, you can't do 'str is null', because
 that will be false. With null there's just no doubt, and no way to get
 the test wrong.

 
 Note that there are two aspects: generating 'nothing here' values, and
 testing for 'nothing here'.

Some have argued that there is an extra dimension to this, namely the 
distinction between "nothing here" and "something here, but that 
something is an empty string".  I am not convinced we should make that 
distinction.


 In keeping with the "be generous with what you receive and conservative
 with what you send" mantra, good functions should test string inputs
 with str.empty and return null strings.

The specific example which spurred the debate was the following:

While there is no doubt that extension("foo") should return null, 
Vladimir Panteleev argued that extension("foo.") should be *specified* to 
return "" (specifically, an empty slice from the end of the input string) 
to indicate that there is an "empty extension".

I disagree, I don't think null and "" should have different semantics.  
The fact that extension() currently *does* behave as Vladimir wants is, 
in my opinion, an implementation detail.
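
To make the two positions concrete, here is a rough sketch. extensionSketch is a hypothetical helper, not the extension() from the proposal; it only assumes that the extension excludes the dot, as the "foo." example above implies:

```d
/// Hypothetical helper, not the std.path implementation.
string extensionSketch(string path)
{
    foreach_reverse (i, c; path)
    {
        if (c == '/' || c == '\\') break;       // stay inside the last path segment
        if (c == '.') return path[i + 1 .. $];  // may be an empty, non-null slice
    }
    return null;                                // no dot found
}

unittest
{
    assert(extensionSketch("foo") is null);     // "nothing here"
    assert(extensionSketch("foo.") !is null);   // "empty extension"
    assert(extensionSketch("foo.") == null);    // yet == cannot tell them apart
    assert(extensionSketch("foo.d") == "d");
}
```

Whether callers may rely on the second assert is exactly the question: specifying it makes the slice trick part of the API, while leaving it unspecified keeps it an implementation detail.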

Note that extension() seems to be the only function for which the 
controversy has arisen so far, so it may not be worth taking this 
discussion too far.

-Lars

Jul 18 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Mon, 18 Jul 2011 18:07:12 +0300, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 The fact that extension() currently *does* behave as Vladimir wants is,
 in my opinion, an implementation detail.

Is it still an implementation detail if it's documented behavior?

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Jul 18 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

Sorry, I thought you meant the old getExt().

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Jul 2011 08:23:18 -0400, torhu <no spam.invalid> wrote:

 On 18.07.2011 11:42, Lars T. Kyllingstad wrote:
 On Sun, 17 Jul 2011 22:38:43 -0700, Jonathan M Davis wrote:

  On Sunday 17 July 2011 22:08:27 Brian Schott wrote:
  The documentation comments for driveName say that the return value  
 will
  be an empty string in some circumstances, but the code and unit tests
  both say that the behavior is to return null.

  The fun part with that is that "" == null and a null string is empty  
 per
  std.array.empty, so it _is_ the empty string. The only difference is
  that "" !is null. So, if the function says that it returns null, then  
 it
  needs to return null. Since it says that it returns the empty string,  
 it
  could return either.

  Now, in spite of all that, there's still a problem since the tests
  verify that the return value is null, not empty. Either the
  documentation should say that it returns null, or the tests should be
  checking for empty, not null. But still, the documentation isn't
  incorrect. And the tests are perfectly valid, but they really  
 shouldn't
  be testing for is null instead of empty when the function is supposed  
 to
  return empty.

 Pending a decision on the null vs. empty issue, I have now standardised
 on using empty() for testing whether functions return empty strings.

 I'd like to make a case for null as the 'nothing here' value.

 The advantage of using null is that all possible ways of testing for  
 'nothingness' (is, ==, as a boolean condition, empty range) will work.  
 But if you return an empty string, you can't do 'str is null', because  
 that will be false.  With null there's just no doubt, and no way to get  
 the test wrong.

The one that's kind of nice is if(path.extension), which not only reads  
much better than if(path.extension == null), but is also a very common  
idiom in many languages (using if to test a string's emptiness).  People  
are likely to get this wrong (in fact, it may make sense for *all* empty  
arrays to evaluate as false for an if condition).

I personally think if there's no real difference, returning null is the  
better option based on these points.

However, if there is some performance/maintenance advantage to not  
returning null, then just return an empty non-null array and specify in  
the API docs that the function returns an empty string.

-Steve

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:

 Based on your comments, I have made some changes to my std.path
 proposal.  A list of the changes I have made can be found at the
 following address (look at the commits dated 2011-07-17):
 
   https://github.com/kyllingstad/phobos/commits/std-path
 
 I believe I have covered most of your requests, with a few exceptions:

It seems I forgot about the CTFEability tests.  I'll fix that too, and 
push the updated code later today.

-Lars

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 10:05:07 +0000, Lars T. Kyllingstad wrote:

 On Sun, 17 Jul 2011 21:27:41 +0000, Lars T. Kyllingstad wrote:
 
 Based on your comments, I have made some changes to my std.path
 proposal.  A list of the changes I have made can be found at the
 following address (look at the commits dated 2011-07-17):
 
   https://github.com/kyllingstad/phobos/commits/std-path
 
 I believe I have covered most of your requests, with a few exceptions:

 
 It seems I forgot about the CTFEability tests.  I'll fix that too, and
 push the updated code later today.

Done.  Most functions were CTFEable without any modifications (thanks, 
Don!). :)

The exceptions are relativePath (because of std.algorithm.cmp) and 
expandTilde (which is strictly a run-time function).

-Lars

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 Based on your comments, I have made some changes to my std.path
 proposal.  A list of the changes I have made can be found at the
 following address (look at the commits dated 2011-07-17):

   https://github.com/kyllingstad/phobos/commits/std-path

This is a review of the docs/design.  I'll review the code separately:

basename's standards section says:

(with suitable adaptions for Windows paths)

adaptions => adaptations

This occurs twice.

In driveName:

Should std.path handle UNC paths?  i.e. \\servername\share\path  (I think  
if it does, it should specify \\servername\share as the drive)

joinPath:

Does this normalize the paths?  For example:

joinPath("/home/steves", "../lars") => /home/steves/../lars or /home/lars ?

If so, the docs should reflect that.  If not, maybe it should :)  If it  
doesn't, at least the docs should state that it doesn't.

pathSplitter:

I think this should be a bi-directional range (no technical limitation I  
can think of).

fcmp:
"On Windows, fcmp is an alias for std.string.icmp, which yields a case  
insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp,  
i.e. a case sensitive comparison."

What about comparing c:/foo with c:\foo?  This isn't going to be equal  
with icmp.

expandTilde:

I've commented on expandTilde from the other posts, but if it is kept a  
posix-only function, the documentation should reflect that.

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:

 On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 Based on your comments, I have made some changes to my std.path
 proposal.  A list of the changes I have made can be found at the
 following address (look at the commits dated 2011-07-17):

   https://github.com/kyllingstad/phobos/commits/std-path

 
 This is a review of the docs/design.  I'll review the code separately:
 
 basename's standards section says:
 
 (with suitable adaptions for Windows paths)
 
 adaptions => adaptations

Oops.  Thanks!

 
 This occurs twice.

Copy+paste. :)

 
 In driveName:
 
 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

Yes, std.path is supposed to support UNC paths.  For instance, the 
following works now:

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));

I guess you would rather have that

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

then?  I am not very familiar with Windows network shares; is \\foo never 
a valid path on its own?

As I understand it, some POSIX systems also mount network drives using 
similar paths.  Does anyone know whether "//foo" is a valid path on these 
systems, or does it have to be "//foo/bar"?


 joinPath:
 
 Does this normalize the paths?  For example:
 
 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?
 
 If so, the docs should reflect that.  If not, maybe it should :)  If it
 doesn't, at least the docs should state that it doesn't.

No, it doesn't, and I don't think it should.  It is better to let the 
user choose whether they want the overhead of normalization by calling 
normalize() explicitly.  I will specify this in the docs.
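
For comparison, the std.path module in current Phobos (which is not the proposal as it stood in this thread) keeps joining and normalizing as two separate calls. A small sketch, restricted to POSIX because the separators differ on Windows:

```d
import std.path : buildNormalizedPath, buildPath;

version (Posix) unittest
{
    // Plain joining leaves ".." segments untouched.
    assert(buildPath("/home/steves", "../lars") == "/home/steves/../lars");

    // Joining with normalization resolves them in the same call.
    assert(buildNormalizedPath("/home/steves", "../lars") == "/home/lars");
}
```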


 pathSplitter:
 
 I think this should be a bi-directional range (no technical limitation I
 can think of).

It is more of a complexity vs. benefit thing, but as you are the second 
person to ask for this, I will look into it.  A convincing use case would 
be nice, though. :)


 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp,
 i.e. a case sensitive comparison."
 
 What about comparing c:/foo with c:\foo?  This isn't going to be equal
 with icmp.

I am a bit unsure what to do about the comparison functions (fcmp, 
pathCharMatch and globMatch).  Aside from the issue with directory 
separators it is, as was pointed out by someone else, entirely possible 
to mount case-sensitive file systems on Windows and case-insensitive file 
systems on POSIX.  (The latter is not uncommon on OSX, I believe.)  I am 
open to suggestions.


 expandTilde:
 
 I've commented on expandTilde from the other posts, but if it is kept a
 posix-only function, the documentation should reflect that.

It does; look at the "Returns" section.  Perhaps it should be moved to a 
more prominent location?

-Lars

Jul 18 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On 2011-07-18 11:25, Lars T. Kyllingstad wrote:
 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
 On Sun, 17 Jul 2011 17:27:41 -0400, Lars T. Kyllingstad
 
 <public kyllingen.nospamnet> wrote:
 Based on your comments, I have made some changes to my std.path
 proposal. A list of the changes I have made can be found at the
 
 following address (look at the commits dated 2011-07-17):
 https://github.com/kyllingstad/phobos/commits/std-path

 
 This is a review of the docs/design. I'll review the code separately:
 
 basename's standards section says:
 
 (with suitable adaptions for Windows paths)
 
 adaptions => adaptations

 
 Oops. Thanks!
 
 This occurs twice.

 
 Copy+paste. :)
 
 In driveName:
 
 Should std.path handle UNC paths? i.e. \\servername\share\path (I
 think if it does, it should specify \\servername\share as the drive)

 
 Yes, std.path is supposed to support UNC paths. For instance, the
 following works now:
 
 assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));
 
 I guess you would rather have that
 
 assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));
 
 then? I am not very familiar with Windows network shares; is \\foo never
 a valid path on its own?
 
 As I understand it, some POSIX systems also mount network drives using
 similar paths. Does anyone know whether "//foo" is a valid path on these
 systems, or does it have to be "//foo/bar"?
 
 joinPath:
 
 Does this normalize the paths? For example:
 
 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?
 
 If so, the docs should reflect that. If not, maybe it should :) If it
 doesn't, at least the docs should state that it doesn't.

 
 No, it doesn't, and I don't think it should. It is better to let the
 user choose whether they want the overhead of normalization by calling
 normalize() explicitly. I will specify this in the docs.
 
 pathSplitter:
 
 I think this should be a bi-directional range (no technical limitation I
 can think of).

 
 It is more of a complexity vs. benefit thing, but as you are the second
 person to ask for this, I will look into it. A convincing use case would
 be nice, though. :)
 
 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp,
 i.e. a case sensitive comparison."
 
 What about comparing c:/foo with c:\foo? This isn't going to be equal
 with icmp.

 
 I am a bit unsure what to do about the comparison functions (fcmp,
 pathCharMatch and globMatch). Aside from the issue with directory
 separators it is, as was pointed out by someone else, entirely possible
 to mount case-sensitive file systems on Windows and case-insensitive file
 systems on POSIX. (The latter is not uncommon on OSX, I believe.) I am
 open to suggestions.
 
 expandTilde:
 
 I've commented on expandTilde from the other posts, but if it is kept a
 posix-only function, the documentation should reflect that.

 
 It does; look at the "Returns" section. Perhaps it should be moved to a
 more prominent location?

I suggest that you do what I did in std.file (e.g. with getTimesWin). I put 
this at the very top of the ddoc comment:

$(BLUE This function is Windows-Only.)

or if it's only on Posix:

$(BLUE This function is Posix-Only.)

- Jonathan M Davis

Jul 18 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:

 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));

 I guess you would rather have that

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo never
 a valid path on its own?

It is and it isn't.  It's *not* a normal directory, because only shares  
can be in that directory.  In other words, the point at which a UNC path  
turns into normal directory structure is after the share name.

An easy way to compare is, you can only map drive letters to shares, not  
to servers.
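
As a rough illustration of that rule, the sketch below treats \\server\share as the root and rejects a bare \\server. uncRootSketch is a hypothetical helper written for this example, not part of the proposed std.path, and it only recognises backslashes:

```d
import std.string : indexOf;

/// Hypothetical helper: returns the `\\server\share` prefix, or null.
string uncRootSketch(string path)
{
    // A UNC path starts with two backslashes followed by a server name.
    if (path.length < 3 || path[0 .. 2] != `\\`)
        return null;

    immutable serverLen = path[2 .. $].indexOf('\\');
    if (serverLen <= 0)
        return null;                  // bare `\\server`, or an empty server name

    immutable shareStart = 2 + serverLen + 1;
    immutable shareLen = path[shareStart .. $].indexOf('\\');
    if (shareLen == 0 || shareStart == path.length)
        return null;                  // empty share name

    // No further separator means the whole path is the root.
    return shareLen < 0 ? path : path[0 .. shareStart + shareLen];
}

unittest
{
    assert(uncRootSketch(`\\foo\bar\baz`) == `\\foo\bar`);
    assert(uncRootSketch(`\\foo\bar`) == `\\foo\bar`);
    assert(uncRootSketch(`\\foo`) is null);
    assert(uncRootSketch(`c:\foo`) is null);
}
```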

 As I understand it, some POSIX systems also mount network drives using
 similar paths.  Does anyone know whether "//foo" is a valid path on these
 systems, or does it have to be "//foo/bar"?

Typically, Linux uses URLs, i.e. smb://server/share.  URL parsing is  
probably not in std.path's charter.

However, I have used a command like:

mount -t cifs //server/share /mnt/serverfiles

But this is only in very special contexts.  In general I don't think //foo  
should be considered a server path on Posix systems.

 joinPath:

 Does this normalize the paths?  For example:

 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?

 If so, the docs should reflect that.  If not, maybe it should :)  If it
 doesn't, at least the docs should state that it doesn't.

 No, it doesn't, and I don't think it should.  It is better to let the
 user choose whether they want the overhead of normalization by calling
 normalize() explicitly.  I will specify this in the docs.

In fact, if you do not normalize during the join, it's *more* overhead to  
normalize afterwards.  If normalization is done while joining, then you  
only build one string.  There's no need to build a non-normalized string,  
then build a normalized string based on that.

Plus the data is only iterated once.

I think it's at least worth an option, but I'm not going to hold back my  
vote based on this :)

 pathSplitter:

 I think this should be a bi-directional range (no technical limitation I
 can think of).

 It is more of a complexity vs. benefit thing, but as you are the second
 person to ask for this, I will look into it.  A convincing use case would
 be nice, though. :)

Well a path is more like a stack than a queue.  You are usually operating  
more on the back side of it.  To provide back and popBack makes a lot of  
sense to me.  For example, to implement the command cd ../foo, you need to  
popBack the topmost directory.
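
A sketch of that use case, written against the pathSplitter in current Phobos, which is a bidirectional range (the version under review here was not yet). cdUpThenInto is a hypothetical helper, and it does not special-case being at the root already:

```d
import std.path : buildPath, pathSplitter;

/// Hypothetical helper: resolves `cd ../sibling` relative to cwd.
string cdUpThenInto(string cwd, string sibling)
{
    auto parts = pathSplitter(cwd);
    if (!parts.empty)
        parts.popBack();              // drop the last directory, stack-style
    return buildPath(buildPath(parts), sibling);
}

version (Posix) unittest
{
    assert(pathSplitter("/home/steves").back == "steves");
    assert(cdUpThenInto("/home/steves", "foo") == "/home/foo");
}
```

With only front and popFront, the same operation would have to walk the whole path and remember the previous element.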

 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp,
 i.e. a case sensitive comparison."

 What about comparing c:/foo with c:\foo?  This isn't going to be equal
 with icmp.

 I am a bit unsure what to do about the comparison functions (fcmp,
 pathCharMatch and globMatch).  Aside from the issue with directory
 separators it is, as was pointed out by someone else, entirely possible
 to mount case-sensitive file systems on Windows and case-insensitive file
 systems on POSIX.  (The latter is not uncommon on OSX, I believe.)  I am
 open to suggestions.

It's definitely something to think about.  At the very least, I think the  
default file system case sensitivity should be mapped to a certain  
function.  It doesn't hurt to expose the opposite sensitivity as an  
alternate (you need to implement both anyway).  A template with all  
options defaulted for the current OS makes good sense I think.  Actually,  
expanding/renaming pathCharMatch provides a perfect way to default these:

e.g.:
version(Windows)
{
    enum defaultOSSensitivity = false;
    enum defaultOSDirSeps = `\/`;
}
else version(Posix)
{
    enum defaultOSSensitivity = true;
    enum defaultOSDirSeps = "/";
}

// replaces pathCharMatch
int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string dirseps  
= defaultOSDirSeps)(dchar a, dchar b);

int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2  
filename2);

Anyone who wants to do alternate comparisons is free to do so using other  
options from pathCharCmp.
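
One way the bodies might look, as a sketch only. The per-OS defaults are left out so the example behaves the same everywhere, fcmpSketch is a hypothetical name, and folding every separator onto the first one is an assumption of this example:

```d
import std.algorithm.searching : canFind;
import std.range.primitives : empty, front, popFront;
import std.uni : toLower;

/// Sketch of pathCharCmp; parameter names follow the declarations above.
int pathCharCmp(bool caseSensitive, string dirSeps)(dchar a, dchar b)
{
    // Fold every separator onto the first one so `/` and `\` compare equal.
    static if (dirSeps.length > 0)
    {
        if (dirSeps.canFind(a)) a = dirSeps[0];
        if (dirSeps.canFind(b)) b = dirSeps[0];
    }
    static if (!caseSensitive)
    {
        a = toLower(a);
        b = toLower(b);
    }
    return a < b ? -1 : (a > b ? 1 : 0);
}

/// Hypothetical fcmp replacement: compares code point by code point.
int fcmpSketch(bool caseSensitive, string dirSeps)(string a, string b)
{
    while (!a.empty && !b.empty)
    {
        immutable c = pathCharCmp!(caseSensitive, dirSeps)(a.front, b.front);
        if (c != 0) return c;
        a.popFront();
        b.popFront();
    }
    // The shorter name sorts first when one is a prefix of the other.
    return a.empty ? (b.empty ? 0 : -1) : 1;
}

unittest
{
    assert(fcmpSketch!(false, `\/`)("c:/foo", `C:\FOO`) == 0);  // Windows-style rules
    assert(fcmpSketch!(true, "/")("foo", "Foo") != 0);          // POSIX-style rules
}
```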

 expandTilde:

 I've commented on expandTilde from the other posts, but if it is kept a
 posix-only function, the documentation should reflect that.

 It does; look at the "Returns" section.  Perhaps it should be moved to a
 more prominent location?

Yes.  It should say (Posix-only).  I believe technically that it should  
fail to compile on Windows if it does not map to a "home" directory  
there.  Note that as named, it's possible to confuse with expanding the  
DOS 8.3 name of a file, i.e. Progra~1

-Steve

Jul 18 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:

 On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:

 
 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar",
   "baz"]));

 I guess you would rather have that

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo
 never a valid path on its own?

 
 It is and it isn't.

Well, that certainly cleared things up. ;)


 It's *not* a normal directory, because only shares
 can be in that directory.  In other words, the point at which a UNC path
 turns into normal directory structure is after the share name.
 
 An easy way to compare is, you can only map drive letters to shares, not
 to servers.

Then driveName() should probably return the full share path.  But, of the 
following asserts, which should pass?

    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);

    assert (baseName(`\\foo\bar`) == `\\foo\bar`);
    assert (baseName(`\\foo\bar`) == "bar");

    assert (dirName(`\\foo\bar`) == `\\foo\bar`);
    assert (dirName(`\\foo\bar`) == `\\foo`);

Note that if you replace `\\foo\bar` with `c:\` in the above, the first 
assert in each pair will pass.  Same with "/" on POSIX.  Basically, that 
choice corresponds to treating `\\foo\bar` as a filesystem root.


 As I understand it, some POSIX systems also mount network drives using
 similar paths.  Does anyone know whether "//foo" is a valid path on
 these systems, or does it have to be "//foo/bar"?

 
 Typically, Linux uses URLs, i.e. smb://server/share.  URL parsing is
 probably not in std.path's charter.
 
 However, I have used a command like:
 
 mount -t cifs //server/share /mnt/serverfiles
 
 But this is only in very special contexts.  In general I don't think
 //foo should be considered a server path on Posix systems.

I actually got a request on the Phobos list that std.path should support 
such paths.  Furthermore, the POSIX standard explicitly mentions "//" 
paths (though it basically says it is implementation-defined whether to 
bother dealing with them).


 joinPath:

 Does this normalize the paths?  For example:

 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?

 If so, the docs should reflect that.  If not, maybe it should :)  If
 it doesn't, at least the docs should state that it doesn't.

 No, it doesn't, and I don't think it should.  It is better to let the
 user choose whether they want the overhead of normalization by calling
 normalize() explicitly.  I will specify this in the docs.

 
 In fact, if you do not normalize during the join, it's *more* overhead
 to normalize afterwards.  If normalization is done while joining, then
 you only build one string.  There's no need to build a non-normalized
 string, then build a normalized string based on that.
 
 Plus the data is only iterated once.
 
 I think it's at least worth an option, but I'm not going to hold back my
 vote based on this :)

If it doesn't turn out to be a huge undertaking, I think I'll replace 
joinPath() with a function buildPath() that takes an input range of path 
segments and joins them together, with optional normalization.  Then, 
normalize(path) can be implemented as:

    buildPath(pathSplitter(path));

Does that sound sensible?


 pathSplitter:

 I think this should be a bi-directional range (no technical limitation
 I can think of).

 It is more of a complexity vs. benefit thing, but as you are the second
 person to ask for this, I will look into it.  A convincing use case
 would be nice, though. :)

 
 Well a path is more like a stack than a queue.  You are usually
 operating more on the back side of it.  To provide back and popBack
 makes a lot of sense to me.  For example, to implement the command cd
 ../foo, you need to popBack the topmost directory.

Ok, I'll see what I can do about it. :)


 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for
 std.algorithm.cmp, i.e. a case sensitive comparison."

 What about comparing c:/foo with c:\foo?  This isn't going to be equal
 with icmp.

 I am a bit unsure what to do about the comparison functions (fcmp,
 pathCharMatch and globMatch).  Aside from the issue with directory
 separators it is, as was pointed out by someone else, entirely possible
 to mount case-sensitive file systems on Windows and case-insensitive
 file systems on POSIX.  (The latter is not uncommon on OSX, I believe.)
  I am open to suggestions.

 
 It's definitely something to think about.  At the very least, I think
 the default file system case sensitivity should be mapped to a certain
 function.  It doesn't hurt to expose the opposite sensitivity as an
 alternate (you need to implement both anyway).  A template with all
 options defaulted for the current OS makes good sense I think. 
 Actually, expanding/renaming pathCharMatch provides a perfect way to
 default these:
 
 e.g.:
 version(Windows)
 {
     enum defaultOSSensitivity = false;
     enum defaultOSDirSeps = `\/`;
 }
 else version(Posix)
 {
     enum defaultOSSensitivity = true;
     enum defaultOSDirSeps = "/";
 }
 
 // replaces pathCharMatch
 int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string
 dirseps = defaultOSDirSeps)(dchar a, dchar b);
 
 int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2
 filename2);
 
 Anyone who wants to do alternate comparisons is free to do so using
 other options from pathCharCmp.

Good idea.  I'll probably implement something like that.


 expandTilde:

 I've commented on expandTilde from the other posts, but if it is kept
 a posix-only function, the documentation should reflect that.

 It does; look at the "Returns" section.  Perhaps it should be moved to
 a more prominent location?

 
 Yes.  It should say (Posix-only).  I believe technically that it should
 fail to compile on Windows if it does not map to a "home" directory
 there.  Note that as named, it's possible to confuse with expanding the
 DOS 8.3 name of a file, i.e. Progra~1

I agree.  I'll put it inside a version(Posix) block.

-Lars

Jul 20 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 20 Jul 2011 13:36:51 -0400, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:

 On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:

 On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:

 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:

 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar",
   "baz"]));

 I guess you would rather have that

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo
 never a valid path on its own?

 It is and it isn't.

 Well, that certainly cleared things up. ;)

It is in that if you open explorer and type in \\servername, it will give  
you a list of shares you can try.  But I don't think it's a valid *path*,  
except in explorer.  So my intuition is to declare it never a valid path.

I'm not sure how \\server interacts with the low level functions of  
Windows (such as CreateFile).  Some research/experimentation is probably  
warranted.

 It's *not* a normal directory, because only shares
 can be in that directory.  In other words, the point at which a UNC path
 turns into normal directory structure is after the share name.

 An easy way to compare is, you can only map drive letters to shares, not
 to servers.

 Then driveName() should probably return the full share path.  But, of the
 following asserts, which should pass?

     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);

     assert (baseName(`\\foo\bar`) == `\\foo\bar`);
     assert (baseName(`\\foo\bar`) == "bar");

     assert (dirName(`\\foo\bar`) == `\\foo\bar`);
     assert (dirName(`\\foo\bar`) == `\\foo`);

 Note that if you replace `\\foo\bar` with `c:\` in the above, the first
 assert in each pair will pass.  Same with "/" on POSIX.  Basically, that
 choice corresponds to treating `\\foo\bar` as a filesystem root.

Yes, I think this sounds right (pending research/experimentation cited  
above).

 As I understand it, some POSIX systems also mount network drives using
 similar paths.  Does anyone know whether "//foo" is a valid path on
 these systems, or does it have to be "//foo/bar"?

 Typically, Linux uses URLs, e.g. smb://server/share.  URL parsing is
 probably not in std.path's charter.

 However, I have used a command like:

 mount -t cifs //server/share /mnt/serverfiles

 But this is only in very special contexts.  In general I don't think
 //foo should be considered a server path on Posix systems.

 I actually got a request on the Phobos list that std.path should support
 such paths.  Furthermore, the POSIX standard explicitly mentions "//"
 paths (though it basically says it is implementation-defined whether to
 bother dealing with them).

ls //root lists the contents of /root.  I'd guess that opening //root with  
open() would simply open /root.  Given that context, they should not be  
considered to be a server path IMO.

 joinPath:

 Does this normalize the paths?  For example:

 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?

 If so, the docs should reflect that.  If not, maybe it should :)  If
 it doesn't, at least the docs should state that it doesn't.

 No, it doesn't, and I don't think it should.  It is better to let the
 user choose whether they want the overhead of normalization by calling
 normalize() explicitly.  I will specify this in the docs.

 In fact, if you do not normalize during the join, it's *more* overhead
 to normalize afterwards.  If normalization is done while joining, then
 you only build one string.  There's no need to build a non-normalized
 string, then build a normalized string based on that.

 Plus the data is only iterated once.

 I think it's at least worth an option, but I'm not going to hold back my
 vote based on this :)

 If it doesn't turn out to be a huge undertaking, I think I'll replace
 joinPath() with a function buildPath() that takes an input range of path
 segments and joins them together, with optional normalization.  Then,
 normalize(path) can be implemented as:

     buildPath(pathSplitter(path));

 Does that sound sensible?

That sounds good.

-Steve
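
To make the "normalize while joining" argument concrete, here is a minimal sketch. The function name joinNormalized is hypothetical and is not part of the std.path proposal under review, and the sketch assumes POSIX-style '/' separators only. It resolves "." and ".." as the segments are consumed, so the non-normalized string is never built.

```d
import std.algorithm.iteration : splitter;
import std.array : join;

/// Hypothetical sketch, POSIX-style separators only: joins path segments
/// and resolves "." and ".." during the join.
string joinNormalized(string[] segments...)
{
    string[] parts;          // normalized components collected so far
    bool absolute = false;

    foreach (seg; segments)
    {
        if (seg.length == 0) continue;
        if (seg[0] == '/')   // an absolute segment restarts the path
        {
            absolute = true;
            parts = null;
        }
        foreach (comp; seg.splitter('/'))
        {
            if (comp.length == 0 || comp == ".") continue;
            if (comp == "..")
            {
                if (parts.length > 0 && parts[$ - 1] != "..")
                    parts = parts[0 .. $ - 1];  // step back one directory
                else if (!absolute)
                    parts ~= comp;              // keep leading ".." of a relative path
                // ".." directly under the root of an absolute path is dropped
            }
            else
                parts ~= comp;
        }
    }

    auto joined = parts.join("/");
    if (absolute) return "/" ~ joined;
    return joined.length ? joined : ".";
}

unittest
{
    assert(joinNormalized("/home/steves", "../lars") == "/home/lars");
    assert(joinNormalized("foo", "./bar/../baz") == "foo/baz");
}
```

The input is iterated once and only the final string is assembled, which is the saving described above. A real implementation would also have to handle Windows separators and drive/UNC roots.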

Jul 20 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:

 I'm not sure how \\server interacts with the low level functions of
 Windows (such as CreateFile).  Some research/experimentation is probably
 warranted.

Any .NET programmers out there?  Can you please tell me what the 
following functions return?

  System.IO.Path.GetDirectoryName("\\foo\bar")
  System.IO.Path.GetPathRoot("\\foo\bar\baz")

-Lars

Jul 20 2011

Jussi Jumppanen <jussij zeusedit.com> writes:

Lars T. Kyllingstad Wrote:

 On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:

 Any .NET programmers out there?  Can you please tell me what the 
 following functions return?
 
   System.IO.Path.GetDirectoryName("\\foo\bar")
   System.IO.Path.GetPathRoot("\\foo\bar\baz")

This code:
  using System;
  
  namespace Test
  {
     static class Program
     {
        [STAThread]
        static void Main()
        {
           string test;
  
           test = @"\\foo\bar\";
           Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
           Console.WriteLine(System.IO.Path.GetDirectoryName(test));
  
           test = @"\\foo\bar";
           Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
           Console.WriteLine(System.IO.Path.GetDirectoryName(test));
  
           test = @"\\foo\bar\baz";
           // Note: the label printed below says GetDirectoryName, but the call is GetPathRoot.
           Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
           Console.WriteLine(System.IO.Path.GetPathRoot(test));
        }
     }
  }

produced this output:

  C:\temp>test.exe
  System.IO.Path.GetDirectoryName(\\foo\bar\)
  \\foo\bar
  System.IO.Path.GetDirectoryName(\\foo\bar)
  
  System.IO.Path.GetDirectoryName(\\foo\bar\baz)
  \\foo\bar

Cheers Jussi

Jul 21 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 21 Jul 2011 03:36:37 -0400, Jussi Jumppanen wrote:

 Lars T. Kyllingstad Wrote:
 
 On Wed, 20 Jul 2011 14:16:04 -0400, Steven Schveighoffer wrote:

 Any .NET programmers out there?  Can you please tell me what the
 following functions return?
 
   System.IO.Path.GetDirectoryName("\\foo\bar")
   System.IO.Path.GetPathRoot("\\foo\bar\baz")

 
 This code:
   using System;
   
   namespace Test
   {
      static class Program
      {
         [STAThread]
         static void Main()
         {
            string test;
   
            test = @"\\foo\bar\";
            Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
            Console.WriteLine(System.IO.Path.GetDirectoryName(test));
   
            test = @"\\foo\bar";
            Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
            Console.WriteLine(System.IO.Path.GetDirectoryName(test));
   
            test = @"\\foo\bar\baz";
            // Note: the label printed below says GetDirectoryName, but the call is GetPathRoot.
            Console.WriteLine("System.IO.Path.GetDirectoryName(" + test + ")");
            Console.WriteLine(System.IO.Path.GetPathRoot(test));
         }
      }
   }
 
 produced this output:
 
   C:\temp>test.exe
   System.IO.Path.GetDirectoryName(\\foo\bar\)
   \\foo\bar
   System.IO.Path.GetDirectoryName(\\foo\bar)
   
   System.IO.Path.GetDirectoryName(\\foo\bar\baz)
   \\foo\bar
 
 Cheers Jussi

Thanks, this is very helpful.  Now we know that MS's APIs treat \\foo\bar 
as a root directory, so we should do the same.  This means that, once I 
get around to implementing it, the following asserts will pass on Windows:

    assert (baseName(`\\foo\bar`) == `\\foo\bar`);
    assert (dirName(`\\foo\bar`) == `\\foo\bar`);
    assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);

This is analogous to the following on POSIX (where the behaviour mimics 
that of the basename and dirname shell utilities):

    assert (baseName("/") == "/");
    assert (dirName("/") == "/");
    assert (pathSplitter("/").front == "/");

-Lars
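
For illustration, the "share is the root" rule can be sketched as a small helper. The name uncRoot is hypothetical and this is not the std.path implementation; it only shows where the root of a UNC path would end if `\\server\share` is treated as the filesystem root, as the .NET output above suggests.

```d
/// Hypothetical helper, not part of std.path: returns the `\\server\share`
/// prefix of a UNC path, or null if the path is not a complete UNC path.
string uncRoot(string path)
{
    // A UNC path starts with exactly two backslashes followed by a server name.
    if (path.length < 3 || path[0 .. 2] != `\\` || path[2] == '\\')
        return null;

    // Find the end of the server name.
    size_t i = 2;
    while (i < path.length && path[i] != '\\') ++i;
    if (i == path.length) return null;   // `\\server` alone: no share

    // Find the end of the share name.
    size_t j = i + 1;
    while (j < path.length && path[j] != '\\') ++j;
    if (j == i + 1) return null;         // `\\server\` with an empty share name

    return path[0 .. j];
}

unittest
{
    assert(uncRoot(`\\foo\bar\baz`) == `\\foo\bar`);
    assert(uncRoot(`\\foo\bar`) == `\\foo\bar`);
    assert(uncRoot(`\\foo`) is null);
}
```

With a helper like this, driveName() could return the slice as-is, and baseName/dirName would return it unchanged when the whole path is the root, matching the asserts above.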

Jul 21 2011

Rainer Schuetze <r.sagitario gmx.de> writes:

On 20.07.2011 20:16, Steven Schveighoffer wrote:
 ls //root lists the contents of /root. I'd guess that opening //root
 with open() would simply open /root. Given that context, they should not
 be considered to be a server path IMO.

If that's true for the bare open() without going through possible 
translations in "ls", I'd guess that "//server/share" would look for a 
file/directory "share" in "/server", so std.path should treat it this 
way for posix, too.

Sorry, if my previous comments in the phobos-list caused confusion, I 
must have confused the mount share with a directory specification.

Jul 21 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 21 Jul 2011 09:09:52 +0200, Rainer Schuetze wrote:

 On 20.07.2011 20:16, Steven Schveighoffer wrote:
 ls //root lists the contents of /root. I'd guess that opening //root
 with open() would simply open /root. Given that context, they should
 not be considered to be a server path IMO.

 
 If that's true for the bare open() without going through possible
 translations in "ls", I'd guess that "//server/share" would look for a
 file/directory "share" in "/server", so std.path should treat it this
 way for posix, too.
 
 Sorry, if my previous comments in the phobos-list caused confusion, I
 must have confused the mount share with a directory specification.

All right, I'll remove "//path" support again.  That simplifies things 
for POSIX, at least.

-Lars

Jul 21 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 20 Jul 2011 17:36:51 +0000, Lars T. Kyllingstad wrote:

 On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
 
 On Mon, 18 Jul 2011 14:25:57 -0400, Lars T. Kyllingstad
 <public kyllingen.nospamnet> wrote:
 
 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:

 
 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar",
   "baz"]));

 I guess you would rather have that

   assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo
 never a valid path on its own?

 
 It is and it isn't.

 
 Well, that certainly cleared things up. ;)
 
 
 It's *not* a normal directory, because only shares can be in that
 directory.  In other words, the point at which a UNC path turns into
 normal directory structure is after the share name.
 
 An easy way to compare is, you can only map drive letters to shares,
 not to servers.

 
 Then driveName() should probably return the full share path.  But, of
 the following asserts, which should pass?
 
     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo\bar`);
     assert (pathSplitter(`\\foo\bar\baz`).front == `\\foo`);
 
     assert (baseName(`\\foo\bar`) == `\\foo\bar`);
     assert (baseName(`\\foo\bar`) == "bar");
 
     assert (dirName(`\\foo\bar`) == `\\foo\bar`);
     assert (dirName(`\\foo\bar`) == `\\foo`);
 
 Note that if you replace `\\foo\bar` with `c:\` in the above, the first
 assert in each pair will pass.  Same with "/" on POSIX.  Basically, that
 choice corresponds to treating `\\foo\bar` as a filesystem root.
 
 
 As I understand it, some POSIX systems also mount network drives using
 similar paths.  Does anyone know whether "//foo" is a valid path on
 these systems, or does it have to be "//foo/bar"?

 
 Typically, Linux uses URLs, e.g. smb://server/share.  URL parsing is
 probably not in std.path's charter.
 
 However, I have used a command like:
 
 mount -t cifs //server/share /mnt/serverfiles
 
 But this is only in very special contexts.  In general I don't think
 //foo should be considered a server path on Posix systems.

 
 I actually got a request on the Phobos list that std.path should support
 such paths.  Furthermore, the POSIX standard explicitly mentions "//"
 paths (though it basically says it is implementation-defined whether to
 bother dealing with them).
 
 
 joinPath:

 Does this normalize the paths?  For example:

 joinPath("/home/steves", "../lars") => /home/steves/../lars or
 /home/lars ?

 If so, the docs should reflect that.  If not, maybe it should :)  If
 it doesn't, at least the docs should state that it doesn't.

 No, it doesn't, and I don't think it should.  It is better to let the
 user choose whether they want the overhead of normalization by calling
 normalize() explicitly.  I will specify this in the docs.

 
 In fact, if you do not normalize during the join, it's *more* overhead
 to normalize afterwards.  If normalization is done while joining, then
 you only build one string.  There's no need to build a non-normalized
 string, then build a normalized string based on that.
 
 Plus the data is only iterated once.
 
 I think it's at least worth an option, but I'm not going to hold back
 my vote based on this :)

 
 If it doesn't turn out to be a huge undertaking, I think I'll replace
 joinPath() with a function buildPath() that takes an input range of path
 segments and joins them together, with optional normalization.  Then,
 normalize(path) can be implemented as:
 
     buildPath(pathSplitter(path));
 
 Does that sound sensible?

Actually, I realise now, it doesn't. :)  Since joinPath/buildPath needs 
to support path segments containing multiple directories, normalize would 
just be

  buildPath(path)

-Lars

Jul 20 2011

Rainer Schuetze <r.sagitario gmx.de> writes:

On 20.07.2011 19:36, Lars T. Kyllingstad wrote:
 On Mon, 18 Jul 2011 14:51:06 -0400, Steven Schveighoffer wrote:
 It's definitely something to think about.  At the very least, I think
 the default file system case sensitivity should be mapped to a certain
 function.  It doesn't hurt to expose the opposite sensitivity as an
 alternate (you need to implement both anyway).  A template with all
 options defaulted for the current OS makes good sense I think.
 Actually, expanding/renaming pathCharMatch provides a perfect way to
 default these:

 e.g.:
 version(Windows)
 {
      enum defaultOSSensitivity = false;
      enum defaultOSDirSeps = `\/`;
 }
 else version(Posix)
 {
      enum defaultOSSensitivity = true;
      enum defaultOSDirSeps = "/";
 }

 // replaces pathCharMatch
 int pathCharCmp(bool caseSensitive = defaultOSSensitivity, string
 dirseps = defaultOSDirSeps)(dchar a, dchar b);

 int fcmp(alias pred = "pathCharCmp(a, b)", S1, S2)(S1 filename1, S2
 filename2);

 Anyone who wants to do alternate comparisons is free to do so using
 other options from pathCharCmp.

 Good idea.  I'll probably implement something like that.

I like the direction that this is heading. If the idea gets extended to 
other functions as well, you won't have to reimplement std.path if you 
have to deal with posix paths on windows and vice versa, e.g. when 
transferring data containing paths between different systems.

Jul 21 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:j01trl$2ia$6 digitalmars.com...
 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar", "baz"]));

 I guess you would rather have that

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo never
 a valid path on its own?

I don't know whether or not it's "never" a valid path, but "dir \\server" 
always fails and "dir \\server\share" always works (assuming it exists, at 
least). So treating the whole thing as a drive might be the right thing to 
do. (Of course, it's completely moronic that Windows works that way...)


 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for std.algorithm.cmp,
 i.e. a case sensitive comparison."

 What about comparing c:/foo with c:\foo?  This isn't going to be equal
 with icmp.

 I am a bit unsure what to do about the comparison functions (fcmp,
 pathCharMatch and globMatch).  Aside from the issue with directory
 separators it is, as was pointed out by someone else, entirely possible
 to mount case-sensitive file systems on Windows and case-insensitive file
 systems on POSIX.  (The latter is not uncommon on OSX, I believe.)  I am
 open to suggestions.

If such mountings are possible, it would seem that there must be some way to 
check the sensitivity (otherwise the OS itself would probably crap out on 
it).

Although, at least in the case of case-insensitive mountings on posix, 
doesn't that mean such paths would have both case-sensitive and 
case-insensitive parts?

Ex: /mount/damnWinDrive/dir/subdir

Wouldn't the "mount/damnWinDrive" part be case-sensitive and the 
"dir/subdir" part be insensitive?

(I'm starting to really despise case-insensitive filesystems.)

Jul 19 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Here's some relevant info:
http://msdn.microsoft.com/en-us/library/aa365247%28v=vs.85%29.aspx

Jul 19 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 19 Jul 2011 15:55:29 -0400, Nick Sabalausky <a a.a> wrote:

 If such mountings are possible, it would seem that there must be some  
 way to
 check the sensitivity (otherwise the OS itself would probably crap out on
 it).

I've done it before, mounted a windows share on a linux box via cifs.

What happens is, everything thinks it's case sensitive (i.e. any  
user-space tools), but when you go to open a file, write a file, rename a  
file, the share performs as if it were case insensitive.

For example:

ls /mnt/winshare

File.txt

find /mnt/winshare -name FILE.TXT

No files found

touch /mnt/winshare/FILE.TXT => updates date/time on File.txt

cat /mnt/winshare/FILE.TXT => outputs File.txt

So as long as you are performing operations *blindly*, the case  
insensitivity kicks in.  For example, open a file without first searching  
for it.  But if you start reading directories, tools have no idea it's on  
a case-insensitive filesystem.

 Although, at least in the case of case-insensitive mountings on posix,
 doesn't that mean such paths would have both case-sensitive and
 case-insensitive parts?

 Ex: /mount/damnWinDrive/dir/subdir

 Wouldn't the "mount/damnWinDrive" part be case-sensitive and the
 "dir/subdir" part be insensitive?

Yes, actually, this is a very good point.  And there's no way for std.path  
to make that distinction.

 (I'm starting to really despise case-insensitive filesystems.)

I've never understood why they have any benefits whatsoever.  The only  
reason I can think of them having any use is legacy.

-Steve

Jul 20 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Tue, 19 Jul 2011 15:55:29 -0400, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:j01trl$2ia$6 digitalmars.com...
 On Mon, 18 Jul 2011 13:16:29 -0400, Steven Schveighoffer wrote:
 In driveName:

 Should std.path handle UNC paths?  i.e. \\servername\share\path  (I
 think if it does, it should specify \\servername\share as the drive)

 Yes, std.path is supposed to support UNC paths.  For instance, the
 following works now:

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo`, "bar",
  "baz"]));

 I guess you would rather have that

  assert (equal(pathSplitter(`\\foo\bar\baz`), [`\\foo\bar`, "baz"]));

 then?  I am not very familiar with Windows network shares; is \\foo
 never a valid path on its own?

 I don't know whether or not it's "never" a valid path, but "dir
 \\server" always fails and "dir \\server\share" always works (assuming
 it exists, at least). So treating the whole thing as a drive might be
 the right thing to do. (Of course, it's completely moronic that Windows
 works that way...)
 
 
 fcmp:
 "On Windows, fcmp is an alias for std.string.icmp, which yields a case
 insensitive comparison. On POSIX, it is an alias for
 std.algorithm.cmp, i.e. a case sensitive comparison."

 What about comparing c:/foo with c:\foo?  This isn't going to be equal
 with icmp.

 I am a bit unsure what to do about the comparison functions (fcmp,
 pathCharMatch and globMatch).  Aside from the issue with directory
 separators it is, as was pointed out by someone else, entirely possible
 to mount case-sensitive file systems on Windows and case-insensitive
 file systems on POSIX.  (The latter is not uncommon on OSX, I believe.)
  I am open to suggestions.

 If such mountings are possible, it would seem that there must be some
 way to check the sensitivity (otherwise the OS itself would probably
 crap out on it).

That check would probably be orders of magnitude more expensive than a 
simple string operation.


 Although, at least in the case of case-insensitive mountings on posix,
 doesn't that mean such paths would have both case-sensitive and
 case-insensitive parts?
 
 Ex: /mount/damnWinDrive/dir/subdir
 
 Wouldn't the "mount/damnWinDrive" part be case-sensitive and the
 "dir/subdir" part be insensitive?

Argh.


 (I'm starting to really despise case-insensitive filesystems.)

Me too.

Does anyone know whether Windows' case insensitivity is limited to ASCII? 
If not, is the filesystem Unicode-aware, or does it use some locale-specific
code page to compare file names?

-Lars

Jul 20 2011

Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:

On 20/07/2011 20:57, Lars T. Kyllingstad wrote:
 Does anyone know whether Windows' case insensitivity is limited to ASCII?
 If not, is the filesystem Unicode-aware, or does it use some locale-specific
 code page to compare file names?

 -Lars

Wikipedia says Windows long file names are up to 255 UTF-16 characters 
(or code points, depending which article you refer to >< ) Seems 
consistent with Microsoft's approach to character encoding throughout 
the rest of the Windows API.

 http://en.wikipedia.org/wiki/Long_filename

A...

Jul 20 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 20 Jul 2011 22:20:16 +0100, Alix Pexton wrote:

 On 20/07/2011 20:57, Lars T. Kyllingstad wrote:
 Does anyone know whether Windows' case insensitivity is limited to
 ASCII? If not, is the filesystem Unicode-aware, or does it use some
 locale-specific code page to compare file names?

 -Lars

 
 Wikipedia says Windows long file names are up to 255 UTF-16 characters
 (or code points, depending which article you refer to >< ) Seems
 consistent with Microsoft's approach to character encoding throughout
 the rest of the Windows API.
 
 http://en.wikipedia.org/wiki/Long_filename


Thanks!  In other words, fcmp() needs to do UTF-16 decoding...

-Lars

Jul 21 2011

Rainer Schuetze <r.sagitario gmx.de> writes:

 Does anyone know whether Windows' case insensitivity is limited to ASCII?
 If not, is the filesystem Unicode-aware, or does it use some locale-specific
 code page to compare file names?

I just tried a few examples: using umlauts works as expected, i.e. upper 
and lower case characters are treated as the same. I then used the Greek 
omega (\u03a9 and \u03c9); files with upper and lower case are still the 
same, even on a FAT-16 USB drive (even though some ~-magic is going 
on there which might not work in Windows 3.1-).

 -Lars

Jul 20 2011

torhu <no spam.invalid> writes:

On 17.07.2011 23:27, Lars T. Kyllingstad wrote:
 - Should it be specified/documented whether a function returns "" or
 null?  Specifically, is it important that

      extension("foo") is null
      extension("foo.") !is null && extension("foo.") == ""

I guess you've already thought about this, but one solution is to just 
return the dot as part of the extension.  Then you get extension("foo.") 
== ".".  I noticed that .NET's GetExtension method does this.

setExtension and defaultExtension would probably have to change to at 
least accept extensions that include the dot, if extension() is changed.
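
A minimal sketch of the dot-included variant, to show how it sidesteps the null vs. "" question. The name dottedExtension is hypothetical; treating a leading dot (as in ".profile") as "no extension" is an assumption, not something decided in this thread.

```d
/// Hypothetical sketch: returns the extension including the leading dot,
/// or null when the name has no extension.
string dottedExtension(string name)
{
    foreach_reverse (i, c; name)
    {
        if (c == '/' || c == '\\') break;   // reached a directory separator
        if (c == '.')
        {
            // A leading dot marks a hidden file, not an extension.
            if (i == 0 || name[i - 1] == '/' || name[i - 1] == '\\') break;
            return name[i .. $];
        }
    }
    return null;
}

unittest
{
    assert(dottedExtension("foo").length == 0);   // no extension
    assert(dottedExtension("foo.") == ".");       // empty extension
    assert(dottedExtension("foo.txt") == ".txt");
}
```

Here "no extension" and "empty extension" differ by content ("" vs. "."), so callers never need to test for null.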

Jul 21 2011
