digitalmars.D - Proposal for std.path replacement

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
started working on a rewrite of std.path a long time ago, but I got 
sidetracked by other things.  The recent discussion got me working on it 
again, and it turned out there wasn't that much left to be done.

So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

Features:

- Most functions work with all string types, i.e. all permutations of 
mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are 
toAbsolute() and toCanonical(), because they rely on std.file.getcwd() 
which returns an immutable(char)[].

- Correct behaviour in corner cases that aren't covered by the current 
std.path.  See the other thread for some examples, or take a look at the 
unittests for a more complete picture.

- Saner naming scheme.  (Still not set in stone, of course.)

-Lars

Mar 03 2011

Jesse Phillips <jessekphillips+D gmail.com> writes:

Lars T. Kyllingstad Wrote:

 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

Well I'll vote yes. Behavior looks very clean.

Mar 03 2011

Jerry Quinn <jlquinn optonline.net> writes:

Lars T. Kyllingstad Wrote:

 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

Rather than:

drive() & stripDrive()
extension() & stripExtension()

would it make sense to combine them?

string[2] splitDrive(path)
string[2] splitExtension(path)

Just a thought.

Jerry

Mar 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, March 03, 2011 10:31:20 Jerry Quinn wrote:
 Lars T. Kyllingstad Wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 Rather than:
 
 drive() & stripDrive()
 extension() & stripExtension()
 
 would it make sense to combine them?
 
 string[2] splitDrive(path)
 string[2] splitExtension(path)

Those might not be bad functions to have, but they could get _really_ annoying 
if they were there _instead_ of the strip functions. I, for one, am very likely 
to be calling functions like stripExtension and passing the result directly into 
another function or using it directly in an expression. Having to throw a [0] or 
[1] on the end of all those calls would be ugly and error-prone (since it would 
be really easy to use the wrong index) - not to mention, it would be less 
efficient, which could matter in cases where you have to process a lot of file 
names.

So, they might not be bad functions to add, but I certainly wouldn't want to see 
them replace the strip functions.
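
The difference at the call site can be sketched in D. This is only an illustration: it assumes the extension and stripExtension names as they exist in today's Phobos std.path (the proposal's names may differ), and splitExtension is a hypothetical helper written just for the comparison.

```d
import std.path : extension, stripExtension;

// Hypothetical combined function (not part of the proposal or of Phobos):
// returns the path without its extension, followed by the extension.
string[2] splitExtension(string path)
{
    return [stripExtension(path), extension(path)];
}

void main()
{
    // Strip-style function: the result goes straight into an expression.
    string a = stripExtension("dir/file.ext") ~ ".bak";

    // Split-style function: every such call needs an index.
    string b = splitExtension("dir/file.ext")[0] ~ ".bak";

    assert(a == "dir/file.bak");
    assert(a == b);
}
```

Both forms give the same result; the split form only pays off when both halves are needed at once.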

- Jonathan M Davis

Mar 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, March 03, 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)

Some comments on names:

1. They should be properly camelcased. fcmp, fncharmatch, and fnmatch are probably 
okay, but basename should definitely be baseName.

2. Please shorten the Separator in the names to Sep (i.e. dirSep, pathSep, and 
isDirSep). They're just as clear that way and less annoyingly long. Similarly, 
I'd rename currentDirSymbol to currDirSymbol - or maybe even have currDirSym and 
parentDirSym.

3. I'd prefer dirName to directory. It's shorter, closer to what was there 
before, and a better name IMHO. directory makes me wonder if it's checking 
whether the name is a directory or not (which is what std.file.isDir does).

4. It might be better to shorten extension/Extension to ext/Ext, but that works 
better with functions like stripExtension (stripExt) than it would with extension 
(ext) by itself, and if we wanted complete consistency, then changing Extension 
to Ext would mean changing extension to ext, which wouldn't really be desirable. 
I'd still be very tempted to rename the xExtension functions to xExt though. 
Extension is unnecessarily long.

5. setExtension might be better as replaceExtension, since set tends to imply 
that you're doing the change to the passed in string rather than just returning 
a new string with the changes.

6. I'd strongly suggest making most of the functions properties (though that 
would require changing the examples). Functions which are nouns (such as drive 
or extension) really should be properties, otherwise they shouldn't have names 
which are nouns. basename/baseName is a funny one though since it's a noun and 
really should be a property, but it does have a version which takes an extra 
parameter, so I'm not sure what to do with that one. Unfortunately, for some 
reason, at the moment you can't overload a property function with a non-property 
function (I keep meaning to open an enhancement request on that).
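
Points 5 and 6 can be illustrated with a small sketch. It assumes the setExtension and baseName names from present-day Phobos std.path rather than whatever the proposal ends up with, and it relies on D allowing a free function to be called with member syntax and without parentheses.

```d
import std.path : baseName, setExtension;

void main()
{
    string path = "dir/report.txt";

    // setExtension returns a new string and leaves its argument alone,
    // even though "set" might suggest an in-place change.
    string renamed = setExtension(path, "md");
    assert(path == "dir/report.txt");
    assert(renamed == "dir/report.md");

    // A noun-like function reads naturally when called property-style.
    assert(path.baseName == "report.txt");
}
```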

As far as examples go, assuming that you made it so that .bashrc is a file with a 
base name of .bashrc and no extension rather than a file with no base name and an 
extension of bashrc (I haven't looked at the implementation at all yet, so I 
don't know what you did with that), then you really should put it (or an example 
like it) in the examples.

At first glance, it looks good overall, but I really think that the noun 
functions should become properties or have their names changed and that some of 
the names really should be shortened. We want properly descriptive names, but it 
doesn't take that much for a longer name to get irritating.

- Jonathan M Davis

Mar 03 2011

Jesse Phillips <jessekphillips+D gmail.com> writes:

Jonathan M Davis Wrote:

 As far as examples go, assuming that you made it so that .bashrc is a file with a 
 base name of .bashrc and no extension rather than a file with no base name and an 
 extension of bashrc (I haven't looked at the implementation at all yet, so I 
 don't know what you did with that), then you really should put it (or an example 
 like it) in the examples.

He did. Empty extension with .bashrc name.
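
That behaviour can be checked with a few assertions. This is a sketch against the current Phobos std.path, whose function names are assumed here and may not match the proposal exactly.

```d
import std.path : baseName, extension, stripExtension;

void main()
{
    // A leading dot marks a hidden file, not the start of an extension.
    assert(baseName(".bashrc") == ".bashrc");
    assert(extension(".bashrc") == "");
    assert(stripExtension(".bashrc") == ".bashrc");
}
```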

Mar 03 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 03 Mar 2011 10:39:38 -0800, Jonathan M Davis wrote:

 On Thursday, March 03, 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)

 
 Some comments on names:
 
 1. They should properly camelcased. fcmp, fncharmatch, and fnmatch are
 probably okay, but basename should definitely be baseName.

We probably couldn't disagree more. :)  I think fncharmatch is a horrible 
name.  On the other hand, basename() is named after the 'basename' UNIX 
utility, and doesn't mean anything on its own.  At least, I've never 
heard of such a thing as the "base name" of a file, but please prove me 
wrong.


 2. Please shorten the Separator in the names to Sep (i.e. dirSep,
 pathSep, and isDirSep). They're just as clear that way and less
 annoyingly long. Similarly, I'd rename currentDirSymbol to currDirSymbol
 - or maybe even have currDirSym and parentDirSym.

 3. I'd prefer dirName to directory. It's shorter, closer to what was
 there before, and a better name IMHO. directory makes me wonder if it's
 checking whether the name is a directory or not (which is what
 std.file.isDir does).
 
 4. It might be better to short extension/Extension to ext/Ext, but that
 works better with functions like stripExtension (stripExt), then it
 would extension (ext) by itself, and if we wanted complete consistency,
 then changing Extension to Ext would mean changing extension to ext,
 which wouldn't really be desirable. I'd still be very tempted to rename
 the xExtension functions to xExt though. Extension is unnecessarily
 long.

I have a preference for the longer names, but not a very strong one.  I'm 
not going to oppose the changes if others agree with you.

 
 5. setExtension might be better as replaceExtension, since set tends to
 imply that you're doing the change to the passed in string rather than
 just returning a new string with the changes.

Good point.


 6. I'd strongly suggest making most of the functions properties (though
 that would require changing the examples). Functions which are nouns
 (such as drive or extension) really should be properties, otherwise they
 shouldn't have names which are nouns. basename/baseName is a funny one
 though since it's a noun and really should be a property, but it does
 have a version which takes an extra parameter, so I'm not sure what to
 do with that one. Unfortunately, for some reason, at the moment you
 can't overload property function with a non-property function (I keep
 meaning to open an enhancement request on that).

Also a good point.  Not only that, most functions should be pure @safe 
nothrow, but I've completely forgotten to add these annotations!


 As far as examples go, assuming that you made it so that .bashrc is a
 file with a base name of .bashrc and no extension rather than a file
 with no base name and an extension of bashrc (I haven't looked at the
 implementation at all yet, so I don't know what you did with that), then
 you really should put it (or an example like it) in the examples.

It's in the examples for extension() and stripExtension().


 At a first glance, it looks good overall, but I really think that the
 noun functions should become properties or have their names changed and
 that some of the names really should be shortened. We want properly
 descriptive names, but it doesn't take that much for a longer name to
 get irritating.
 
 - Jonathan M Davis

Thanks for the feedback!

-Lars

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 09:33 AM, Lars T. Kyllingstad wrote:
 1. They should properly camelcased. fcmp, fncharmatch, and fnmatch are
  probably okay, but basename should definitely be baseName.


 We probably couldn't disagree more. :)  I think fncharmatch is a horrible
 name.  On the other hand, basename() is named after the 'basename' UNIX
 utility, and doesn't mean anything on its own.  At least, I've never
 heard of such a thing as the "base name" of a file, but please prove me
 wrong.

I agree with Jonathan about 'baseName' (2 words ==> camelcased, e basta!) (*). 
And indeed names like 'fcmp' or 'fncharmatch' should not even be allowed to 
live ;-).

Denis

(*) Except when the first part is a prefix / preposition, like input, output, 
subtype, supertype, transcode... in which case the name is or would be a single 
word in regular English.
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 04 March 2011 00:33:58 Lars T. Kyllingstad wrote:
 On Thu, 03 Mar 2011 10:39:38 -0800, Jonathan M Davis wrote:
 1. They should properly camelcased. fcmp, fncharmatch, and fnmatch are
 probably okay, but basename should definitely be baseName.

 
 We probably couldn't disagree more. :)  I think fncharmatch is a horrible
 name.  On the other hand, basename() is named after the 'basename' UNIX
 utility, and doesn't mean anything on its own.  At least, I've never
 heard of such a thing as the "base name" of a file, but please prove me
 wrong.

I have no problem with finding better names than those. I was more saying that 
the names that they have shouldn't be camelcased. They'd be absolutely hideous 
if they were.

As for basename, I'd argue baseName because it's properly camelcased that way 
(and therefore follows Phobos' general naming conventions). I don't find the fact 
that there's a unix utility by that name to be particularly relevant to the 
casing of the name. But regardless, the concept of a file's "base name" is quite 
clear, even if you've never heard of the unix utility before.

 2. Please shorten the Separator in the names to Sep (i.e. dirSep,
 pathSep, and isDirSep). They're just as clear that way and less
 annoyingly long. Similarly, I'd rename currentDirSymbol to currDirSymbol
 - or maybe even have currDirSym and parentDirSym.
 
 3. I'd prefer dirName to directory. It's shorter, closer to what was
 there before, and a better name IMHO. directory makes me wonder if it's
 checking whether the name is a directory or not (which is what
 std.file.isDir does).
 
 4. It might be better to short extension/Extension to ext/Ext, but that
 works better with functions like stripExtension (stripExt), then it
 would extension (ext) by itself, and if we wanted complete consistency,
 then changing Extension to Ext would mean changing extension to ext,
 which wouldn't really be desirable. I'd still be very tempted to rename
 the xExtension functions to xExt though. Extension is unnecessarily
 long.

 
 I have a preference for the longer names, but not a very strong one.  I'm
 not going to oppose the changes if others agree with you.

I definitely like descriptive names, and my function names are often long, but I 
do tend to find that long names can get annoying - especially if you have to use 
them often. So, I think that you should generally choose shorter names as long 
as they are appropriately descriptive. A name like stripExt is clear enough - 
especially in context - to work quite well, so the longer name stripExtension is 
unnecessary, whereas ext may not be clear enough and the full name extension 
would probably be better.

 6. I'd strongly suggest making most of the functions properties (though
 that would require changing the examples). Functions which are nouns
 (such as drive or extension) really should be properties, otherwise they
 shouldn't have names which are nouns. basename/baseName is a funny one
 though since it's a noun and really should be a property, but it does
 have a version which takes an extra parameter, so I'm not sure what to
 do with that one. Unfortunately, for some reason, at the moment you
 can't overload property function with a non-property function (I keep
 meaning to open an enhancement request on that).

 
 Also a good point.  Not only that, most functions should be pure @safe
 nothrow, but I've completely forgotten to add these annotations!

Indeed. I should have noticed and mentioned the lack of pure and nothrow as 
well. I haven't generally messed with @safe yet though, since so many critical 
functions in Phobos aren't @safe yet, so you can't really make much @safe. If 
you can though, it's definitely desirable.

 As far as examples go, assuming that you made it so that .bashrc is a
 file with a base name of .bashrc and no extension rather than a file
 with no base name and an extension of bashrc (I haven't looked at the
 implementation at all yet, so I don't know what you did with that), then
 you really should put it (or an example like it) in the examples.

 
 It's in the examples for extension() and stripExtension().

Ah, so it is. I missed it. I was looking for something real like .bashrc rather 
than .file, and I glanced over it too quickly to notice it.

- Jonathan M Davis

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 12:01 PM, Jonathan M Davis wrote:
 I have a preference for the longer names, but not a very strong one.  I'm
  not going to oppose the changes if others agree with you.


 I definitely like descriptive names, and my function names are often long, but I
 do tend to find that long names can get annoying - especially if you have to use
 them often. So, I think that you should generally choose shorter names as long
 as they are appropriately descriptive. A name like stripExt is clear enough -
 especially in context - to work quite well, so the longer name stripExtension is
 unnecessary, whereas ext may not be clear enough and the full name extension

I tend to agree with you.
Especially on the point that (very) common names can be shorter. On one hand, 
they are more easily understood & memorised precisely because they are common; 
on the other, you get the maximum benefit in terms of user-friendliness for the 
same reason (that they are common). Abbreviating rarer names makes the code 
harder to understand for (very) little benefit.
Now, is stripExt/stripExtension that common? I would say no. The day you need 
it, you may have to write it several times because you're dealing with a piece 
of code that copes with file names. Right, then, you may like it to be shorter. 
But this "pain" will soon stop; and maybe, probably?, you won't have to write 
that name again for weeks or months. What do you think?

Another factor is the inherent clarity of the abbreviation. 'ext' can certainly 
be interpreted in various ways. As you say, context helps a lot; but that is a 
decisive argument only for languages in which context prefixes, such as module 
names, are commonly used, e.g. "path.stripExt(fileName)". But this is not common 
practice in D, thus function names need to be more precise, I guess.
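
For what it's worth, D can produce the prefixed style on request through a renamed import. The sketch below assumes a present-day Phobos, whose std.path postdates this thread and provides baseName, stripExtension and extension; the file name is made up for the example.

```d
import path = std.path;   // renamed import: members must be qualified

void main()
{
    auto fileName = "/home/user/report.txt";

    // Every call now carries the module name as context.
    assert(path.baseName(fileName) == "report.txt");
    assert(path.stripExtension(fileName) == "/home/user/report");
    assert(path.extension(fileName) == ".txt");
}
```

Since this is opt-in per importing module, it does not change the point that the unqualified names have to be clear on their own.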

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

"Nick Sabalausky" <a a.a> writes:

"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2175.1299248868.4748.digitalmars-d puremagic.com...
 On 03/04/2011 12:01 PM, Jonathan M Davis wrote:
 I have a preference for the longer names, but not a very strong one. 
 I'm
  not going to oppose the changes if others agree with you.


 I definitely like descriptive names, and my function names are often 
 long, but I
 do tend to find that long names can get annoying - especially if you have 
 to use
 them often. So, I think that you should generally choose shorter names as 
 long
 as they are appropriately descriptive. A name like stripExt is clear 
 enough -
 especially in context - to work quite well, so the longer name 
 stripExtension is
 unnecessary, whereas ext may not be clear enough and the full name 
 extension

 I tend to agree with you.
 Especially on the point that (very) common names can be shorter. On one 
 hand, they are more easily understood & memorised precisely because they 
 are common; on the other, you get the maximum benefit in terms of 
 user-friendliness for the same reason (that they are common). Abbreviating 
 rarer names makes the code harder to understand for (very) little 
 benefit.
 Now, is stripExt/stripExtension that common? I would say no. The day you 
 need it, you may have to write it several times because you're dealing 
 with a piece of code that copes with file names. Right, then, you may like 
 it to be shorter. But this "pain" will soon stop; and maybe, probably?, you 
 won't have to write that name again for weeks or months. What do you 
 think?

 Another factor is the inherent clarity of the abbreviation. 'ext' can 
 certainly be interpreted in various ways. As you say, context helps much; 
 but it's a decisive argument for languages in which context prefixes, such 
 as module names, are commonly used: eg "path.stripExt(fileName)". But this 
 is not common practice in D, thus func names need be more precise, I 
 guess.

Maybe it's just me having been knee-deep into the Win/MS-DOS world since 
well into the 8.3 days, but "ext" always instinctively means "file 
extension" to me. Of course, like I said, I'm happy with "extension" too, but 
just FWIW.

Mar 04 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 04/03/11 02:59, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

I like it. It certainly looks a lot cleaner than the current std.path.

I am interested in why you chose to use templates to allow not only 
char, dchar and wchar arrays, but also const, mutable, and immutable. My 
first instinct would be to use non-templated functions that take const 
char[].

-- 
Graham St Jack

Mar 03 2011

Bekenn <leaveme alone.com> writes:

On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].

Please don't ever restrict encodings like that.  As much as possible, 
libraries should seek to be encoding agnostic (though I'm all for 
const-qualifying parameters).  This is one area where I feel the 
standard library severely lacks at present.
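
As a rough illustration of what "encoding agnostic" can look like in D, the sketch below templates a function on the character type. isRooted is a hypothetical helper invented for this example (it is not part of ltk.path) and it only understands '/' as a separator.

```d
import std.traits : isSomeChar;

// Hypothetical helper: reports whether a POSIX-style path starts at the
// root. It is templated on the character type C, so one definition serves
// char, wchar and dchar strings of any mutability.
bool isRooted(C)(const(C)[] path) if (isSomeChar!C)
{
    return path.length > 0 && path[0] == '/';
}

void main()
{
    assert(isRooted("/usr/lib"));        // string  (immutable(char)[])
    assert(isRooted("/usr/lib"w));       // wstring (immutable(wchar)[])
    assert(isRooted("/usr/lib"d));       // dstring (immutable(dchar)[])
    assert(!isRooted("usr/lib".dup));    // char[]  (mutable)
}
```

The const(C)[] parameter is fine here because the function returns a bool and no slice of its input.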

As a Windows developer, I prefer to use wchar strings by default and use 
only the W versions of the Windows API functions, because the A versions 
severely limit functionality.  Only the W versions have full support for 
Unicode; the A versions are entirely dependent on the current (8-bit) 
code page.  This means no support for UNC paths or paths longer than 260 
characters, and also means that international characters commonly end up 
completely garbled.  Good practice in Windows is to consider the A 
versions deprecated and avoid them like the plague.

References:
	http://msdn.microsoft.com/en-us/library/dd317752%28v=VS.85%29.aspx
	http://blogs.msdn.com/b/michkap/archive/2006/10/24/867880.aspx
	http://blogs.msdn.com/b/michkap/archive/2006/08/22/707665.aspx
	http://blogs.msdn.com/b/michkap/archive/2007/05/07/2464778.aspx

When I first started looking at D, I compiled the win32 example on the D 
web page.  I noticed it used MessageBoxA, so I changed that to 
MessageBoxW.  That generated an error, because nobody had bothered to 
add a MessageBoxW declaration.  That was the very last time I used 
std.c.windows.

Mar 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 03 March 2011 18:04:11 Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].

 
 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.

It's not a bad thing for functions to be templatized on string type. However, I 
would point out that it's fairly common practice to just use string everywhere 
except where you need a string to be a random-access range - in which case you 
use dstring - or where you need to pass a string to a Windows system function, 
in which case you convert it to a wstring or wchar[]. The need to use wstring 
when using Phobos is practically non-existent. Now, if you're frequently calling 
Windows system functions directly, then wstring and wchar[] would be used far 
more frequently, but don't expect Phobos to be using wstring much.
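
A minimal sketch of that convert-at-the-boundary approach, assuming a current Phobos and druntime (std.utf.toUTF16 and toUTF16z, plus the MessageBoxW declaration in core.sys.windows.windows). The strings are made up for the example, and the Windows call is only compiled on Windows.

```d
import std.utf : toUTF16, toUTF16z;

void main()
{
    string text = "Grüße";                  // UTF-8, the usual D string type

    // Re-encode as UTF-16 only where a wstring is actually needed.
    wstring wide = toUTF16(text);
    assert(text.length == 7);               // 7 UTF-8 code units
    assert(wide.length == 5);               // 5 UTF-16 code units

    version (Windows)
    {
        import core.sys.windows.windows;

        // toUTF16z yields a zero-terminated UTF-16 pointer for the W API.
        MessageBoxW(null, text.toUTF16z, "Greeting".toUTF16z, MB_OK);
    }
}
```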

It's likely that more of Phobos will become string-type-agnostic and templatize 
on string type, but there may be functions which aren't simply due to the 
increased complexity or because no one gets around to it with everything else 
that needs doing. The normal string type to use is string, so that's generally 
what is designed for.

 As a Windows developer, I prefer to use wchar strings by default and use
 only the W versions of the Windows API functions, because the A versions
 severely limit functionality.  Only the W versions have full support for
 Unicode; the A versions are entirely dependent on the current (8-bit)
 code page.  This means no support for UNC paths or paths longer than 260
 characters, and also means that international characters commonly end up
 completely garbled.  Good practice in Windows is to consider the A
 versions deprecated and avoid them like the plague.
 
 References:
 	http://msdn.microsoft.com/en-us/library/dd317752%28v=VS.85%29.aspx
 	http://blogs.msdn.com/b/michkap/archive/2006/10/24/867880.aspx
 	http://blogs.msdn.com/b/michkap/archive/2006/08/22/707665.aspx
 	http://blogs.msdn.com/b/michkap/archive/2007/05/07/2464778.aspx
 
 When I first started looking at D, I compiled the win32 example on the D
 web page.  I noticed it used MessageBoxA, so I changed that to
 MessageBoxW.  That generated an error, because nobody had bothered to
 add a MessageBoxW declaration.  That was the very last time I used
 std.c.windows.

If there are key functions like that that you expect to be in druntime (in this 
case, it would be core.sys.windows.windows - I believe that std.c is deprecated, 
or at least it's not intended to be used anymore; OS-specific functions like that 
go in core), then open up enhancement requests for them, or they're unlikely to 
be added. I don't believe that anyone is going through the entirety of the 
Windows API (or the Posix API for that matter) and adding all those functions to 
druntime. They generally get added as required by Phobos or other stuff in 
druntime or because someone requests it. Not to mention, you can always use 
github to make the appropriate changes to druntime and create a pull request. That 
would likely speed up such improvements.

- Jonathan M Davis

Mar 03 2011

Graham St Jack <Graham.StJack internode.on.net> writes:

On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take 
 const
 char[].

 Please don't ever restrict encodings like that.  As much as possible, 
 libraries should seek to be encoding agnostic (though I'm all for 
 const-qualifying parameters).  This is one area where I feel the 
 standard library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and 
 use only the W versions of the Windows API functions, because the A 
 versions severely limit functionality.  Only the W versions have full 
 support for Unicode; the A versions are entirely dependent on the 
 current (8-bit) code page.  This means no support for UNC paths or 
 paths longer than 260 characters, and also means that international 
 characters commonly end up completely garbled.  Good practice in 
 Windows is to consider the A versions deprecated and avoid them like 
 the plague.

Ok, I don't mind supporting wchar and dchar in addition to char, 
especially if Windows insists on using them.

My main issue here is with the constness of the parameters. I think the 
correct parameter to pass is const C[]. This has the advantages of:
* Accepting both mutable and immutable data.
* Declares that the function won't mutate the data.
* Declares that the function doesn't expect the data to be immutable.

It would be even better to use const scope char[], declaring that a 
reference won't be kept, but it seems that scope in this context is 
deprecated.

Once upon a time "in" meant const scope. Does anyone know what it means now?

-- 
Graham St Jack

Mar 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 
 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.
 
 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 
 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.
 
 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.
 
 It would be even better to use const scope char[], declaring that a
 reference won't be kept, but it seems that scope in this context is
 deprecated.
 
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

That's still what it means. scope in this context is _not_ deprecated. Only 
scoped local variables (not scoped parameters or scope statements) are 
deprecated. in would be the correct thing to use. It's used elsewhere with 
strings in Phobos. And yes, as long as the strings being passed in are not being 
mutated, then having the parameters be in is the correct thing to do.

- Jonathan M Davis

Mar 03 2011

Bekenn <leaveme alone.com> writes:

On 3/3/2011 10:17 PM, Jonathan M Davis wrote:
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

 That's still what it means. scope in this context is _not_ deprecated.

Oh, hey, I didn't know that.  Even better.  Thanks!

Mar 03 2011

Bekenn <leaveme alone.com> writes:

On 3/3/2011 7:23 PM, Graham St Jack wrote:
 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

Agreed; I think I might modify that slightly to "in" instead of "const", 
but it means the exact same thing.

 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

"in" is a synonym for a non-parenthesized const.

Mar 03 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 
 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.
 
 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

The problem is that the functions return slices of their input argument, 
which means that the constancy of the input argument gets transferred to 
the return value.  Here's an example to illustrate:

    C[] first(C)(const C[] s) { return s[0 .. 1]; }

    char[] a = "hello".dup;
    auto b = first(a);

Try to compile this, and you get the error message

    Error: cannot implicitly convert expression (s[0u..1u])
    of type const(char[]) to char[]

The correct thing to do is to use inout, like this:

    inout(C)[] basename(C)(inout(C)[] path) { ... }

I templated the functions on character type rather than string type 
exactly because I plan to do this.  Unfortunately, it's not possible at 
the moment, because inout isn't properly implemented yet:

    http://d.puremagic.com/issues/show_bug.cgi?id=3748
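
For reference, here is a sketch of what the inout version of the first() example above is meant to do. It assumes a compiler in which inout works as specified, which was not yet the case when this was posted.

```d
// Sketch: inout carries the caller's qualifier through to the return type.
inout(C)[] first(C)(inout(C)[] s)
{
    return s[0 .. 1];
}

void main()
{
    char[] a = "hello".dup;     // mutable
    const(char)[] b = a;        // const view of the same data
    string c = "hello";         // immutable

    static assert(is(typeof(first(a)) == char[]));
    static assert(is(typeof(first(b)) == const(char)[]));
    static assert(is(typeof(first(c)) == string));

    first(a)[0] = 'J';          // allowed: the result is still mutable
    assert(a == "Jello");
}
```

One function body serves all three cases, and the returned slice is exactly as mutable as the argument was.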

Note that the functions which do not return slices of their input, such 
as setExtension(), joinPath(), etc., all have input parameters that are 
properly marked with 'in'.

 It would be even better to use const scope char[], declaring that a
 reference won't be kept, but it seems that scope in this context is
 deprecated.
 
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

Still does.

-Lars

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 09:15 AM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

 The problem is that the functions return slices of their input argument,
 which means that the constancy of the input argument gets transferred to
 the return value.  Here's an example to illustrate:

      C[] first(C)(const C[] s) { return s[0 .. 1]; }

      char[] a = "hello".dup;
      auto b = first(a);

 Try to compile this, and you get the error message

      Error: cannot implicitly convert expression (s[0u..1u])
      of type const(char[]) to char[]

IIUC, this means const should never be used on input parameters. Instead of 
describing what the func will (not) do with its param(s), it imposes undue 
requirements on the outside world. Or am I missing something? From my point of 
view, qualifiers inside a function's interface should only describe the 
function's behaviour.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 04 Mar 2011 09:33:04 +0100, spir wrote:

 On 03/04/2011 09:15 AM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

 The problem is that the functions return slices of their input
 argument, which means that the constancy of the input argument gets
 transferred to the return value.  Here's an example to illustrate:

      C[] first(C)(const C[] s) { return s[0 .. 1]; }

      char[] a = "hello".dup;
      auto b = first(a);

 Try to compile this, and you get the error message

      Error: cannot implicitly convert expression (s[0u..1u]) of type
      const(char[]) to char[]

 
 IIUC, this means const should never be used on input parameters. Instead
 of meaning what the func will (not) do with its param(s), it imposes
 undue requirements on the outside world. Or do I miss something? From my
 point of view, qualifiers inside a function's interface should only
 describe the function behaviour.

It should not be used if the function's return value is an alias of an 
input parameter.  That's what inout is for.  In all other cases, const is 
fine.

  http://www.digitalmars.com/d/2.0/function.html#inout-functions

-Lars

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 04:23 AM, Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].

 Please don't ever restrict encodings like that. As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters). This is one area where I feel the standard
 library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and use only
 the W versions of the Windows API functions, because the A versions severely
 limit functionality. Only the W versions have full support for Unicode; the A
 versions are entirely dependent on the current (8-bit) code page. This means
 no support for UNC paths or paths longer than 260 characters, and also means
 that international characters commonly end up completely garbled. Good
 practice in Windows is to consider the A versions deprecated and avoid them
 like the plague.

 Ok, I don't mind supporting wchar and dchar in addition to char, especially if
 Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the correct
 parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

 It would be even better to use const scope char[], declaring that a reference
 won't be kept, but it seems that scope in this context is deprecated.

 Once upon a time "in" meant const scope. Does anyone know what it means now?

AFAIK, not only is 'in' still const scope, but it means precisely what your 
param is: plain input.
(I would love params to be 'in' by default.)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 07:17 AM, Jonathan M Davis wrote:
 On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.

 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.

 It would be even better to use const scope char[], declaring that a
 reference won't be kept, but it seems that scope in this context is
 deprecated.

 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

 That's still what it means. scope in this context is _not_ deprecated. Only
 scoped local variables (not scoped parameters or scope statements) are
 deprecated. in would be the correct thing to use. It's used elsewhere with
 strings in Phobos. And yes, as long as the strings being passed in are not being
 mutated, then having the parameters be in is the correct thing to do.

What about 'in' as default? I think a function changing its params is a special 
case --and somewhat unsafe-- which should be clearly indicated at the interface 
level.
	void decode (S,T) (S source, mutable T target) {...}
unchanged ---------------------^

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

Looks very good. Including doc. A real pleasure to explore :-)

Jonathan: "I'd prefer dirName to directory."
Agreed. (The element in question is a name, not a piece of data modelling 
directory.)
[that's about all I would criticize ;-)]

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 03 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 04 Mar 2011 00:42:53 +0100, spir wrote:

 On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

 
 Looks very good. Including doc. A real pleasure to explore :-)

Thanks!


 Jonathan: "I'd prefer dirName to directory." Agreed. (The element in
 question is a name, not a piece of data modelling directory.)
 [that's ~ all what I would criticize ;-)]

One more vote for dirName() has been noted. :)

-Lars

Mar 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)

Meh. Since we have basename which is a replica of the homonym Unix 
command, I think dirname (with that exact spelling) would be most 
appropriate there.

Andrei

Mar 04 2011

David Nadlinger <see klickverbot.at> writes:

On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)

 Meh. Since we have basename which is a replica of the homonym Unix
 command, I think dirname (with that exact spelling) would be most
 appropriate there.

I must admit that I don't quite remember the results of the previous 
naming convention discussion regarding function names »imported« from 
other languages/systems, but my position hasn't changed since then: Just 
go with the D naming convention to make the name easy to remember/guess 
for _D programmers_, keeping in mind that DMD has a spell-checking 
feature to assist people new to D.

I'd also prefer dirName for clarity, by the way, but I don't think 
longer names are worse generally.

David

Mar 04 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 04 March 2011 07:42:38 David Nadlinger wrote:
 On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)

 Meh. Since we have basename which is a replica of the homonym Unix
 command, I think dirname (with that exact spelling) would be most
 appropriate there.

 I must admit that I don't quite remember the results of the previous
 naming convention discussion regarding function names »imported« from
 other languages/systems, but my position hasn't changed since then: Just
 go with the D naming convention to make the name easy to remember/guess
 for _D programmers_, keeping in mind that DMD has a spell-checking
 feature to assist people new to D.

 I'd also prefer dirName for clarity, by the way, but I don't think
 longer names are worse generally.

The general consensus of the previous discussion was that we would stick to 
camelcase regardless of where the original name came from. People generally 
considered it harder to remember the names if you had to remember that they had 
screwy casing.

- Jonathan M Davis

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/04/2011 04:42 PM, David Nadlinger wrote:
 On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)

 Meh. Since we have basename which is a replica of the homonym Unix
 command, I think dirname (with that exact spelling) would be most
 appropriate there.

 I must admit that I don't quite remember the results of the previous naming
 convention discussion regarding function names »imported« from other
 languages/systems, but my position hasn't changed since then: Just go with the
 D naming convention to make the name easy to remember/guess for _D
 programmers_, keeping in mind that DMD has a spell-checking feature to assist
 people new to D.

Yes; if we keep on adopting names that don't follow D's convention under the 
pretext they exist somewhere, then let's just throw the convention to the 
garbage, stop talking about such topics, and let everyone (first lib authors) 
use whatever they find nice.
"patchwork lexicon"

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 04 2011

spir <denis.spir gmail.com> writes:

On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

Looks very good. Including doc. A real pleasure to explore :-)

Jonathan: "I'd prefer dirName to directory."
Agreed. (The element in question is a name, not a piece of data modelling 
directory.)
[that's ~ all what I would criticize ;-)]

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 03 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

I'm certainly all in favor of this (being the one that started the 
"std.path.getName(): Screwy by design?" thread in the first place ;) ). It's 
a huge improvement over the current std.path.

My (updated) comments:

Names:
- Given the choice between "*Separator" and "*Sep", I would lean towards 
"*Sep". But I'd be perfectly happy either way since I'll probably never use 
either of them (Like I said in the other thread, I'd rather just use 
forward-slash everywhere and convert to backslashes (probably via 
toCanonical) as-needed. So much less messy that way.)

- Regarding any other abbreviation like "extension" vs "ext" or "directory" 
vs "dir": I like them either way. It's all good.

- Need to change "basename" to "baseName". I'd actually be happier with 
"fileNameOf", but "baseName" is fine, too. No objection, as long as it's 
camel-cased.

- fcmp, fncharmatch and fnmatch are pretty awful names. If I just look at 
"fcmp" I can't tell what the hell the "f" means. Compare floating-point 
numbers? Doesn't remotely scream "compare file names".  The other two are 
worse: "fn" keeps telling my brain "function" every time I look at it even 
though I know full well I'm looking at std.path. I don't really care much 
about the exact final name, but as examples, anything along these lines 
would be good:
    fcmp -> cmpFileName
    fnmatch -> matchPath, matchGlob or maybe matchFileName
    fncharmatch -> matchPathChar, or maybe matchFileNameChar or 
matchFileChar

- Regarding "directory", I disagree with Jonathan that it sounds like it 
checks if it's a valid directory. I would *strongly expect* a function like 
that to be named "isDirectory" or "isDir". I'm happy with the name 
"directory" and prefer it to his suggestion of "dirName", *but* I would not 
object to "dirName" (as long as "drive" is changed to "driveName" for 
consistency). Something like "directoryOf" or "dirNameOf" might be even 
better still. I'd be happy with any of the above though.

Functionality:
- toCanonical needs to expand the tilde "~" path. If it already does, the 
docs should mention this.

The tilde needs to be thought out more:
- Is "~/foo" a relative path or absolute? Either way, it should be 
documented. If it's relative, then toAbsolute needs to expand it (and be 
documented as such).

- Maybe it should be "expandHomeDir" or "expandHome" instead of 
"expandTilde"?

- Windows *does* have a concept of a home dir, so maybe tilde should be 
expanded even on Windows. Only problem though is that Windows has *two* main 
home dirs for each user: %HOMEPATH% for user-created files and %APPDATA% for 
application data. (And some others, but I don't think any of the others are 
appropriate for "~") So maybe there should be these three:

        1. expandTilde: Exactly as it is now: expands ~ on posix, no-op on 
windows.

        2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the 
user's home directory. On windows: Expands "~" and "%HOMEDIR%" to whatever 
%HOMEDIR% is set to.

        3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to the 
user's home directory. On windows: Expands "~" and "%APPDATA%" to whatever 
%APPDATA% is set to.

- Speaking of %HOMEDIR% and %APPDIR%, there should be a function that 
expands all of these: 
http://technet.microsoft.com/en-us/library/cc749104(WS.10).aspx   Although I 
think those are all environment vars, so a function that just expands env 
vars might be good enough. Either way, it should definitely be called by 
toCanonical (and documented as such). Special thought would have to be given 
to how to handle this cross-platform. Maybe any ones with obvious 
equivalents on posix (like %HOMEDIR%, %APPDIR%, and %TEMP%) should be 
converted appropriately on posix. Maybe posix uses a different delimiter, 
and if so, how to handle each delimiter on each platform should be 
thought out.

That's all I can think of right now.

Of course if your proposed module becomes the new std.path just as it is and 
the above improvements wait for another update, I'd still be happy. Heck, 
again, your module is a *huge* improvement over the current std.path even 
just as it is.

Mar 03 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 03 Mar 2011 22:51:01 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I'm certainly all in favor of this (being the one that started the
 "std.path.getName(): Screwy by design?" thread in the first place ;) ).
 It's a huge improvement over the current std.path.

Thanks!


 My (updated) comments:
 
 Names:
 - Given the choice between "*Separator" and "*Sep", I would lean towards
 "*Sep". But I'd be perfectly happy either way since I'll probably never
 use either of them (Like I said in the other thread, I'd rather just use
 forward-slash everywhere and convert to backslashes (probably via
 toCanonical) as-needed. So much less messy that way.)
 
 - Regarding any other abbreviation like "extension" vs "ext" or
 "directory" vs "dir": I like them either way. It's all good.
 
 - Need to change "basename" to "baseName". I'd actually be happier with
 "fileNameOf", but "baseName" is fine, too. No objection, as long as it's
 camel-cased.

See my comments to Jonathan's post.


 - fcmp, fncharmatch and fnmatch are pretty awful names. If I just look
 at "fcmp" I can't tell what the hell the "f" means. Compare
 floating-point numbers? Doesn't remotely scream "compare file names". 
 The other two are worse: "fn" keeps telling my brain "function" every
 time I look at it even though I know full well I'm looking at std.path.
 I don't really care much about the exact final name, but as examples,
 anything along these lines would be good:
     fcmp -> cmpFileName
     fnmatch -> matchPath, matchGlob or maybe matchFileName
     fncharmatch -> matchPathChar, or maybe matchFileNameChar or
 matchFileChar

I agree the names are terrible, and I like your suggestions.  I suggest 
we go with cmpPath, matchPath and matchPathChar.


 - Regarding "directory", I disagree with Jonathan that it sounds like it
 checks if it's a valid directory. I would *strongly expect* a function
 like that to be named "isDirectory" or "isDir". I'm happy with the name
 "directory" and prefer it to his suggestion of "dirName", *but* I would
 not object to "dirName" (as long as "drive" is changed to "driveName"
 for consistency). Something like "directoryOf" or "dirNameOf" might be
 even better still. I'd be happy with any of the above though.

I prefer "directory" over "dirName" too, but I can live with the latter.  
You'll have a hard time convincing me to use "directoryOf" or 
"dirNameOf", though. ;)


 Functionality:
 - toCanonical needs to expand the tilde "~" path. If it already does,
 the docs should mention this.

I hadn't thought of this, good thing you brought it up.  One thing which 
should perhaps be taken into consideration is that toCanonical() as it is 
now is a rather simple in-memory string operation.  expandTilde(), on the 
other hand, does disk I/O in some cases (it does a /etc/passwd lookup).  
I'll have to think some more about this.


 The tilde needs to be thought out more: - Is "~/foo" a relative path or
 absolute? Either way, it should be documented. If it's relative, then
 toAbsolute needs to expand it (and be documented as such).

Good point.  It's definitely an absolute path, so I'll need to include 
this case in isAbsolute().  It is of course technically possible to set 
$HOME to a relative path, but it is quite pointless and not something we 
should worry about.
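
For illustration, here is a minimal sketch of one way to get that answer
without touching isAbsolute() itself: expand the tilde first, then test the
result. isAbsoluteAfterTilde is a made-up name, and the sketch assumes the
expandTilde() and isAbsolute() functions found in today's Phobos std.path,
not necessarily the signatures in the proposed module.

```d
import std.path : expandTilde, isAbsolute;

/// Hypothetical helper (not part of any std.path): treats "~/foo" as
/// absolute by expanding the tilde before testing.
bool isAbsoluteAfterTilde(string path)
{
    // expandTilde() leaves the path untouched if it cannot expand it
    // (for example on Windows, or when $HOME is not set).
    return isAbsolute(expandTilde(path));
}

unittest
{
    version (Posix)
    {
        import std.process : environment;

        // Pin $HOME so the result does not depend on the machine.
        auto oldHome = environment.get("HOME");
        scope (exit) environment["HOME"] = oldHome;
        environment["HOME"] = "/home/demo";

        assert(isAbsoluteAfterTilde("~/foo"));
        assert(isAbsoluteAfterTilde("/etc/passwd"));
        assert(!isAbsoluteAfterTilde("foo/bar"));
    }
}
```

Note that this inherits the cost mentioned above: for "~user" paths the
expansion may involve a passwd lookup, so the check is no longer a pure
string operation.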


 - Maybe it should be "expandHomeDir" or "expandHome" instead of
 "expandTilde"?
 
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:
 
         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
 on windows.
 
         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.
 
         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
 the user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.

On POSIX you expect to be able to use ~ anywhere you're asked to input a 
path/filename.  Is this the case on Windows?  Can you write
%HOMEDIR%\report.doc in Word's "Open" dialog, for instance?


 - Speaking of %HOMEDIR% and %APPDIR%, there should be a function that
 expands all of these:
 http://technet.microsoft.com/en-us/library/cc749104(WS.10).aspx  
 Although I think those are all environment vars, so a function that just
 expands env vars might be good enough. Either way, it should definitely
 be called by toCanonical (and documented as such). Special thought would
 have to be given to how to handle this cross-platform. Maybe any ones
 with obvious equivalents on posix (like %HOMEDIR%, %APPDIR%, and %TEMP%)
 should be converted appropriately on posix. Maybe posix uses a different
 delimiter, and if so, how to handle each delimiter on each platform
 should be thought out.

I agree a function that expands environment variables could be useful, 
but I don't think std.path is the right place for it.  Perhaps an 
"expand" member function of std.process.environment?


 That's all I can think of right now.

Thanks for the feedback!


 Of course if your proposed module becomes the new std.path just as it is
 and the above improvements wait for another update, I'd still be happy.
 Heck, again, your module is a *huge* improvement over the current
 std.path even just as it is.

:)

-Lars

Mar 04 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
 on windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
 the user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.

 On POSIX you expect to be able to use ~ anywhere you're asked to input a
 path/filename.  Is this the case on Windows?  Can you write
 %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?

No, it's just an environment variable. In fact, it seems that % is a valid 
filename character (I wouldn't have even guessed that), so expanding any of 
the %BLAH% stuff in std.path is probably a bad idea after all.

The expandTilde/expandHomeDir/expandAppDataDir working on *just* tilde might 
be a good idea though. Although maybe it would be better to just have 
expandTilde and then add these two functions instead:

  - getHomeDir(): Posix: Returns expanded form of "~". Windows: Returns 
expanded form of "%HOMEDIR%"

  - getAppDataDir(): Posix: Returns expanded form of "~". Windows: Returns 
expanded form of "%APPDATA%"

Mar 04 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Fri, 04 Mar 2011 10:13:04 -0000, Nick Sabalausky <a a.a> wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
 on windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
 the user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.

 On POSIX you expect to be able to use ~ anywhere you're asked to input a
 path/filename.  Is this the case on Windows?  Can you write
 %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?

 No, it's just an environment variable.

Actually, you can.  I just tried Textpad and Word 2010 and both accepted  
me typing:

%HOMEDRIVE%%HOMEPATH%\ (at this point they both bring up suggestions)
%APPDATA%\ (at this point they both bring up suggestions)

FYI.. my environment variables are:

APPDATA=C:\Users\rheath.<domain>\AppData\Roaming
HOMEDRIVE=C:
HOMEPATH=\Users\rheath.<domain>

I don't have HOMEDIR, .. this is on Windows 7 x64 BTW.

 In fact, it seems that % is a valid
 filename character (I wouldn't have even guessed that), so expanding any  
 of
 the %BLAH% stuff in std.path is probably a bad idea after all.

Not necessarily, but it might require a bit more double-checking, for  
example..

If you type the following at command prompt you get an error.
copy con test%HOMEDRIVE%.txt

"The filename, directory name, or volume label syntax is incorrect."

Because %HOMEDRIVE% is expanded to C: and testC:.txt is invalid.

But these both work:
copy con test%HOMEDRIVE.txt (missing 2nd %)
copy con test%HOMEDRIV%.txt (nonexistent envvar)

In the latter case you actually get a file named "test%HOMEDRIV%.txt"; it  
hasn't attempted to replace the nonexistent envvar with a blank string,  
as that would result in "test.txt".
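
The behaviour observed above (defined variables are substituted, while
unterminated or undefined references are kept literally) can be sketched in D
roughly as follows. This is only an approximation of what the command prompt
appears to do in these three cases, not a description of its actual rules.
expandPercentVars is a made-up name, and the sketch assumes
std.process.environment.get() and std.string.indexOf() as found in today's
Phobos.

```d
import std.process : environment;
import std.string : indexOf;

/// Hypothetical helper: expands %NAME% references the way the examples
/// above suggest. Undefined or unterminated references are left as-is.
string expandPercentVars(string input)
{
    string result;
    size_t pos = 0;
    while (pos < input.length)
    {
        // Locate the opening '%' and the '%' that would close it.
        const ptrdiff_t open = input.indexOf('%', pos);
        if (open < 0)
            break;
        const ptrdiff_t close = input.indexOf('%', open + 1);
        if (close < 0)
            break; // unterminated: the rest is copied verbatim below

        string name = input[open + 1 .. close];
        string value = name.length ? environment.get(name) : null;
        if (value !is null)
        {
            // Defined variable: substitute it and skip past the closing '%'.
            result ~= input[pos .. open] ~ value;
            pos = close + 1;
        }
        else
        {
            // Undefined: keep the text literally. The second '%' is left
            // unconsumed because it may open another reference.
            result ~= input[pos .. close];
            pos = close;
        }
    }
    return result ~ input[pos .. $];
}

unittest
{
    // DEMO_DRIVE and DEMO_NO_SUCH_VAR are made-up names for this test.
    environment["DEMO_DRIVE"] = "C:";
    scope (exit) environment.remove("DEMO_DRIVE");

    assert(expandPercentVars("test%DEMO_DRIVE%.txt") == "testC:.txt");
    assert(expandPercentVars("test%DEMO_DRIVE.txt") == "test%DEMO_DRIVE.txt");
    assert(expandPercentVars("test%DEMO_NO_SUCH_VAR%.txt")
        == "test%DEMO_NO_SUCH_VAR%.txt");
}
```

Whether the result is then a valid file name (the "testC:.txt" case) is a
separate check that such a function would not perform.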

R

p.s. "copy con" means copy console, type something, then press ctrl+z to  
mark EOF.

Mar 04 2011

"Nick Sabalausky" <a a.a> writes:

"Regan Heath" <regan netmail.co.nz> wrote in message 
news:op.vrtj9iz454xghj puck.auriga.bhead.co.uk...
 On Fri, 04 Mar 2011 10:13:04 -0000, Nick Sabalausky <a a.a> wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
 on windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
 the user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.

 On POSIX you expect to be able to use ~ anywhere you're asked to input a
 path/filename.  Is this the case on Windows?  Can you write
 %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?

 No, it's just an environment variable.

 Actually, you can.  I just tried Textpad and Word 2010 and both accepted 
 me typing:

 %HOMEDRIVE%%HOMEPATH%\ (at this point they both bring up suggestions)
 %APPDATA%\ (at this point they both bring up suggestions)

 FYI.. my environment variables are:

 APPDATA=C:\Users\rheath.<domain>\AppData\Roaming
 HOMEDRIVE=C:
 HOMEPATH=\Users\rheath.<domain>

 I don't have HOMEDIR, .. this is on Windows 7 x64 BTW.

Oh, you're right. It's the same for me on XP. I must have misread the doc 
page: There is no %HOMEDIR%, that's why it didn't work when I tried it in 
notepad. The correct thing is %HOMEDRIVE%%HOMEPATH%. That works for me, and 
so does %APPDATA%.
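
With the variable names corrected, the getHomeDir()/getAppDataDir() idea from
earlier in the thread could look roughly like the sketch below. Both functions
are hypothetical (nothing like them exists in the proposed module), and the
sketch assumes std.process.environment.get() and std.path.expandTilde() as
found in today's Phobos.

```d
import std.path : expandTilde;
import std.process : environment;

/// Hypothetical: the current user's home directory.
string getHomeDir()
{
    version (Windows)
        // There is no %HOMEDIR%; the home dir is split over two variables.
        return environment.get("HOMEDRIVE", "") ~ environment.get("HOMEPATH", "");
    else
        // expandTilde() returns "~" unchanged if it cannot expand it.
        return expandTilde("~");
}

/// Hypothetical: where per-user application data should go.
string getAppDataDir()
{
    version (Windows)
        return environment.get("APPDATA", "");
    else
        // Assumption taken from the proposal above: on posix, app data
        // lives under the home directory.
        return expandTilde("~");
}

unittest
{
    version (Posix)
    {
        assert(getHomeDir().length > 0);
        assert(getHomeDir() == getAppDataDir());
    }
}
```

On Windows the functions return an empty string when the variables are not
set; a real implementation would need to decide how to report that.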


 In fact, it seems that % is a valid
 filename character (I wouldn't have even guessed that), so expanding any 
 of
 the %BLAH% stuff in std.path is probably a bad idea after all.

 Not necessarily, but it might require a bit more double-checking, for 
 example..

 If you type the following at command prompt you get an error.
 copy con test%HOMEDRIVE%.txt

 "The filename, directory name, or volume label syntax is incorrect."

 Because %HOMEDRIVE% is expanded to C: and testC:.txt is invalid.

 But these both work:
 copy con test%HOMEDRIVE.txt (missing 2nd %)
 copy con test%HOMEDRIV%.txt (nonexistent envvar)

 In the latter case you actually get a file named "test%HOMEDRIV%.txt"; it 
 hasn't attempted to replace the nonexistent envvar with a blank string, 
 as that would result in "test.txt".

FWIW, I just did a little test to see if the substitution is being done by 
the commandline itself or by the command being run. Seems to be the 
commandline itself doing the substitution. Make a little echo program in D:

import std.stdio;
void main(string[] args) {
    writeln(args[1]);
}

dmd main.d
main %APPDATA%

C:\Documents and Settings\Nick Sabalausky\Application Data

Next thing to test would be the file I/O API. I'm wondering if passing 
"%APPDATA%" directly to the file I/O routines would be taken literally or 
get automatically expanded. I would think it would be taken literally, but 
with all the magic that windows does, I'm not so sure. Don't have time to 
test it at the moment.

Mar 04 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

How about functions for getting common directories like the home and 
temp directory.

-- 
/Jacob Carlborg

Mar 04 2011

J Chapman <j ch.com> writes:

== Quote from Jacob Carlborg (doob me.com)'s article
 On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

 How about functions for getting common directories like the home and
 temp directory.

They'd belong in a separate module - std.environment?

Mar 04 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-04 11:31, J Chapman wrote:
 == Quote from Jacob Carlborg (doob me.com)'s article
 On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

       http://kyllingen.net/code/ltk/doc/path.html
       https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars

 How about functions for getting common directories like the home and
 temp directory.

 They'd belong in a separate module - std.environment?

Yeah, that might be a better idea.

-- 
/Jacob Carlborg

Mar 04 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Friday 04 March 2011 00:25:31 spir wrote:
 On 03/04/2011 07:17 AM, Jonathan M Davis wrote:
 On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].

 
 Please don't ever restrict encodings like that.  As much as possible,
 libraries should seek to be encoding agnostic (though I'm all for
 const-qualifying parameters).  This is one area where I feel the
 standard library severely lacks at present.
 
 As a Windows developer, I prefer to use wchar strings by default and
 use only the W versions of the Windows API functions, because the A
 versions severely limit functionality.  Only the W versions have full
 support for Unicode; the A versions are entirely dependent on the
 current (8-bit) code page.  This means no support for UNC paths or
 paths longer than 260 characters, and also means that international
 characters commonly end up completely garbled.  Good practice in
 Windows is to consider the A versions deprecated and avoid them like
 the plague.

 
 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.
 
 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.
 
 It would be even better to use const scope char[], declaring that a
 reference won't be kept, but it seems that scope in this context is
 deprecated.
 
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?

 
 That's still what it means. scope in this context is _not_ deprecated.
 Only scoped local variables (not scoped parameters or scope statements)
 are deprecated. in would be the correct thing to use. It's used elsewhere
 with strings in Phobos. And yes, as long as the strings being passed in
 are not being mutated, then having the parameters be in is the correct
 thing to do.

 
 What about 'in' as default? I think a function changing its params is a
 special case --and somewhat unsafe-- which should be clearly indicated at
 the interface level.
 	void decode (S,T) (S source, mutable T target) {...}
 unchanged ---------------------^

That's not how D works. That's not how it will ever work. Every time such an
idea has been brought up to Walter (and probably any of the major developers
for that matter), it has been shot down. It's too big a departure from C, C++,
Java,

instead of marking stuff as const or in a decent chunk of the time, you're
going to have to mark stuff as mutable a decent chunk of the time - possibly
more. It would confuse most programmers to no benefit. You'd just be trading a
common, known default for an uncommon, strange one (for C-based languages
anyway) and changing which variables you had to mark with what. You're still
going to have to mark plenty of variables as something other than the default.

So, no. in will never be the default.
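
For anyone following along who hasn't used the qualifier, here is a minimal
sketch of an explicitly marked in parameter. countSlashes is a made-up helper,
not something from std.path or the proposed module.

```d
/// Hypothetical helper: counts '/' characters. The `in` qualifier promises
/// callers that `path` is only read, never modified.
size_t countSlashes(in char[] path)
{
    size_t n;
    foreach (c; path)
        if (c == '/')
            ++n;
    // path[0] = 'x';   // would not compile: `in` implies const
    return n;
}

unittest
{
    // The same function accepts mutable, const and immutable data.
    char[] mutablePath = "a/b/c".dup;
    const(char)[] constPath = "a/b";
    string immutablePath = "/usr/lib/";

    assert(countSlashes(mutablePath) == 2);
    assert(countSlashes(constPath) == 1);
    assert(countSlashes(immutablePath) == 3);
}
```

The function only returns a count, so the concern raised elsewhere in the
thread about returning slices of a const input does not arise here.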

- Jonathan M Davis

Mar 04 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

I don't want to jinx it, but there seems to be a lot of agreement in this 
thread. Seriously, how often does that happen around here? :)

Mar 04 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

Not too often, so I take it as a good sign that I'm onto something. ;)

The only disagreement seems to be about the naming, so let's have a round 
of voting.  Here are a few alternatives for each function.  Please say 
which ones you prefer.

 * dirSeparator, dirSep, sep
 * currentDirSymbol, currentDirSym, curDirSymbol
 * basename, baseName, filename, fileName
 * dirname, dirName, directory, getDir, getDirName
 * drivename, driveName, drive, getDrive, getDriveName
 * extension, ext, getExt, getExtension
 * stripExtension, stripExt

(The same convention will be used for stripExtension, replaceExtension 
and defaultExtension.)
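
To make the candidates easier to compare, here is a small sketch of how some
of them read in use. It is written against the std.path that ships with
Phobos today, which happens to use these spellings; that is an observation
about the current library, not a claim about how this vote turned out or
about the signatures in the proposed module.

```d
import std.path;

unittest
{
    // A posix-style path; '/' is accepted as a separator on Windows too.
    enum path = "/home/user/report.tar.gz";

    assert(baseName(path) == "report.tar.gz");
    assert(dirName(path) == "/home/user");
    assert(extension(path) == ".gz");
    assert(stripExtension(path) == "/home/user/report.tar");

    // The dotfile case discussed earlier in the thread: no extension.
    assert(baseName(".bashrc") == ".bashrc");
    assert(extension(".bashrc") == "");

    version (Posix)   assert(dirSeparator == "/");
    version (Windows) assert(driveName(`d:\file`) == "d:");
}
```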

-Lars

Mar 05 2011

Bekenn <leaveme alone.com> writes:

dirSeparator	-- I'd actually prefer pathSeparator, but that's not on the 
list.
currentDirSymbol
baseName
dirName
driveName
extension
stripExtension


Abbrvs impr rdblty.

Mar 05 2011

Jim <bitcirkel yahoo.com> writes:

Bekenn Wrote:

 dirSeparator	-- I'd actually prefer pathSeparator, but that's not on the 
 list.
 currentDirSymbol
 baseName
 dirName
 driveName
 extension
 stripExtension


++vote

...except that I like the current distinction between pathSeparator and
dirSeparator as it is. pathSeparator should divide paths not directories.

Mar 05 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:iktojn$go0$1 digitalmars.com...
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 Not too often, so I take it as a good sign that I'm onto something. ;)

 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.

 * dirSeparator, dirSep, sep

dirSep, but I'd be fine with the others too.


 * currentDirSymbol, currentDirSym, curDirSymbol

currDirSymbol, but I'd be fine with the others too.


 * basename, baseName, filename, fileName

baseName or baseFileName

Definitely not 'filename' because I frequently use that as a variable name.

Definitely not 'basename' because it's not camel-cased, and because the fact 
that there's a unix command named 'basename' is completely irrelevant. 
Patchwork naming "convention" is idiotic.

And I'm uncomfortable with fileName because despite it being much more 
descriptive than baseName, it's too close to what I'd use as a common 
variable name.


 * dirname, dirName, directory, getDir, getDirName

dirName or directory. But anything except 'dirname' is fine.


 * drivename, driveName, drive, getDrive, getDriveName

driveName or drive. But anything except 'drivename' is fine.


 * extension, ext, getExt, getExtension

ext. But the others are fine, too.


 * stripExtension, stripExt

stripExt, but either one is fine.

Well now everyone, I think that I would have to have to say to all of the 
people here in this newsgroup that excess verbosity can and does (and would 
continue to) harm readability every last bit as much as having 2 mny abbrs 
wuld harm the readability of the name of a variable, or a function or really 
any other custom-named identifier that may or may not exist in D, or in 
Phobos, or in any code written in D, or really any other language regardless 
if it happens to be a programming language or some other sort of a language 
such as a human language.

Mar 05 2011

spir <denis.spir gmail.com> writes:

On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol

 currDirSymbol, But I'd be fine with the others too.

"currDirSymbol" not on the list ;-)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 05 2011

"Nick Sabalausky" <a a.a> writes:

"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2213.1299361218.4748.digitalmars-d puremagic.com...
 On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol

 currDirSymbol, But I'd be fine with the others too.

 "currDirSymbol" not on the list ;-)

I deliberately added it :)  I think it's better than "curDirSymbol" (but 
like I said, I can go either way.)

Mar 05 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Without even looking at any posts in this discussion, what is a
directory *symbol* anyway?

Mar 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 05 March 2011 17:22:01 Andrej Mitrovic wrote:
 Without even looking at any posts in this discussion, what is a
 directory *symbol* anyway?

currDirSym would be ".", and parentDirSym would be "..". They're what you use to 
refer to the current directory and to navigate up to its parent. It's quite 
clear if you look at the docs.

- Jonathan M Davis

Mar 05 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I dunno, maybe I'd prefer an enum.

enum path : string { current = ".", up = ".." };

main() { string newPath = join("C:", "Windows", "Subdir", path.up,
path.up, "Program Files");
newPath == r"C:\Windows\Subdir\..\..\Program Files";

This is just nitpicking however. And 'current' is only used on Linux afaik? :)
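
For anyone who wants to try the idea, here is a minimal sketch. It is only an illustration under a few assumptions: it uses buildPath, buildNormalizedPath and dirSeparator as they exist in present-day Phobos std.path instead of the join call above, and PathSymbol is a made-up name, not something from the proposed module. Relative segments are used so the same assertions should hold on both Windows and POSIX.

```d
import std.array : join;
import std.path : buildNormalizedPath, buildPath, dirSeparator;

// Hypothetical enum for illustration; not part of the proposed module.
enum PathSymbol : string
{
    current = ".",
    up = "..",
}

void main()
{
    // An enum with a string base type converts implicitly to string.
    string up = PathSymbol.up;

    // buildPath only joins the segments with the platform separator.
    string joined = buildPath("projects", "app", "src", up, up, "docs");
    assert(joined == ["projects", "app", "src", "..", "..", "docs"].join(dirSeparator));

    // buildNormalizedPath also resolves the ".." segments.
    string normalized = buildNormalizedPath("projects", "app", "src", up, up, "docs");
    assert(normalized == ["projects", "docs"].join(dirSeparator));
}
```

Whether named constants like these read better than the plain "." and ".." literals is a matter of taste.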

On 3/6/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Saturday 05 March 2011 17:22:01 Andrej Mitrovic wrote:
 Without even looking at any posts in this discussion, what is a
 directory *symbol* anyway?

 currDirSym would be ".", and parentDirSym would "..". It's what you use when
 navigating directories backwards. It's quite clear if you look at the docs.

 - Jonathan M Davis

Mar 05 2011

"Nick Sabalausky" <a a.a> writes:

"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.2230.1299375838.4748.digitalmars-d puremagic.com...
I dunno, maybe I'd prefer an enum.

 enum path : string { current = ".", up = ".." };

 main() { string newPath = join("C:", "Windows", "Subdir", path.up,
 path.up, "Program Files");
 newPath == r"C:\Windows\Subdir\..\..\Program Files";

 This is just nitpicking however. And 'current' is only used on Linux 
 afaik? :)

Windows has always had the '.' meaning "current directory". Even early 
versions of MS-DOS had it.

Mar 05 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 01:35 AM, Nick Sabalausky wrote:
 "spir"<denis.spir gmail.com>  wrote in message
 news:mailman.2213.1299361218.4748.digitalmars-d puremagic.com...
 On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol

 currDirSymbol, But I'd be fine with the others too.

 "currDirSymbol" not on the list ;-)

 I deliberately added it :)  I think it's better than "curDirSymbol" (but
 like I said, I can go either way.)

I agree with you and Jonathan on that point. I also find that 'dir' is 
enough, esp. in context, because it can hardly be misinterpreted. And it's very 
widely used in programming (not only in a couple of PLs) and in computer use in 
general. Thus, it's one rare case where I find an abbreviation OK.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/05/2011 05:32 PM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad"<public kyllingen.NOSPAMnet>  wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 Not too often, so I take it as a good sign that I'm onto something. ;)

 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.

   * dirSeparator, dirSep, sep

dirSep
   * currentDirSymbol, currentDirSym, curDirSymbol

currentDirSymbol
   * basename, baseName, filename, fileName

baseName, fileName
   * dirname, dirName, directory, getDir, getDirName

dirName, getDirName
   * drivename, driveName, drive, getDrive, getDriveName

driveName, getDriveName
   * extension, ext, getExt, getExtension
   * stripExtension, stripExt
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)

don't mind

About "xyz" vs "xyzName": the point is what is denoted /is/ a name. It's not a 
programming object modelling a directory or a drive.

 -Lars

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 05 2011

J Chapman <johnch_atms hotmail.com> writes:

== Quote from Lars T. Kyllingstad (public kyllingen.NOSPAMnet)'s article
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 Not too often, so I take it as a good sign that I'm onto something. ;)
 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.
  * dirSeparator, dirSep, sep

dirSeparator

  * currentDirSymbol, currentDirSym, curDirSymbol

currentDirSymbol

  * basename, baseName, filename, fileName

baseName (but prefer getBaseName for consistency with below)

  * dirname, dirName, directory, getDir, getDirName

getDirName

  * drivename, driveName, drive, getDrive, getDriveName

getDriveName

  * extension, ext, getExt, getExtension

getExtension

  * stripExtension, stripExt

stripExtension (but prefer removeExtension)

 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)
 -Lars

Mar 05 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Please no repetitions in consonants, e.g. "curr". That's something
I'll keep screwing up when typing, and all I'll get back is "no
curDirName in main.d", or "symbol not found", etc..

Mar 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 05 March 2011 14:07:44 Andrej Mitrovic wrote:
 Please no repetitions in consonants, e.g. "curr". That's something
 I'll keep screwing up when typing, and all I'll get back is "no
 curDirName in main.d", or "symbol not found", etc..

LOL. Whereas I'd argue that there _should_ be a repetition in consonants if the 
word that's being abbreviated has a double consonant. Otherwise, it looks like 
it's spelled wrong, and _I_ for one would be constantly mis-typing it.

However, regardless of which way it goes, I _would_ point out that you wouldn't 
get an error message as bad as you suggest. It should be asking you if you meant 
X (where X is whatever the correct spelling is) instead, unlike what compilers 
for languages like C++ or Java normally do.

- Jonathan M Davis

Mar 05 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/5/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 However, regardless of which way it goes, I _would_ point out that you
 wouldn't
 get an error message as bad as you suggest. It should be asking you if you
 meant
 X (where X is whatever the correct spelling is) instead, unlike languages
 like
 C++ or Java normally do.

 - Jonathan M Davis

Oh yeah, I forgot about DMD's semi-recent inclusion of "did you
mean..?" error messages. They're actually quite useful, and I wish
Optlink did the same: "Oh, did you mean to link in mylibrary, not my
liblaly.lib?"

Mar 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.
 
  * dirSeparator, dirSep, sep

dirSep and pathSep. Having Separator in the name is unnecessarily long.

  * currentDirSymbol, currentDirSym, curDirSymbol

currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if 
abbreviating both current and symbol is too much). Shorter but still quite 
clear.

I would _definitely_ use two r's when abbreviating current though, since current 
has two r's. I confess that it's a major pet peeve of mine when I see current 
abbreviated with one r. It feels like it's being spelled wrong.

  * basename, baseName, filename, fileName

baseName

  * dirname, dirName, directory, getDir, getDirName

dirName

  * drivename, driveName, drive, getDrive, getDriveName

driveLetter would probably be better actually - though it _could_ be more than 
one letter if someone has an insane number of drives (it's usually referred to 
as a drive letter though). Barring that, drive would be fine (as long as it's a 
property).

  * extension, ext, getExt, getExtension
  * stripExtension, stripExt
 
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)

I'm a bit torn between extension and ext - I'd like ext but am afraid it's a bit 
too short for clarity. However, I _do_ think that all of the names which have 
Extension as a component should use Ext instead. It's much shorter and still 
quite clear.

- Jonathan M Davis

Mar 05 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:

 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a
 round of voting.  Here are a few alternatives for each function. 
 Please say which ones you prefer.
 
  * dirSeparator, dirSep, sep

 
 dirSep and pathSep. Having Separator in the name is unnecessarily long.
 
  * currentDirSymbol, currentDirSym, curDirSymbol

 
 currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if
 abbreviating both current and symbol is too much). Shorter but still
 quite clear.
 
 I would _definitely_ use two r's when abbreviating current though, since
 current has two r's. I confess that it' a major pet peeve of mine when I
 see current abbreviate with one r. It feels like it's being spelled
 wrong, since current has two r's.
 
  * basename, baseName, filename, fileName

 
 baseName
 
  * dirname, dirName, directory, getDir, getDirName

 
 dirName
 
  * drivename, driveName, drive, getDrive, getDriveName

 
 driveLetter would probably be better actually - though it _could_ be
 more than one letter if someone has an insane number of drives (it's
 usually referred to as a drive letter though). Barring that, drive would
 be fine (as long as it's a property).

Interestingly, it seems drive names are actually restricted to one 
letter.  See the last paragraph of this section:

http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

-Lars

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 04:11:35 Lars T. Kyllingstad wrote:
 On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:
 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a
 round of voting.  Here are a few alternatives for each function.
 Please say which ones you prefer.
 
  * dirSeparator, dirSep, sep

 
 dirSep and pathSep. Having Separator in the name is unnecessarily long.
 
  * currentDirSymbol, currentDirSym, curDirSymbol

 
 currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if
 abbreviating both current and symbol is too much). Shorter but still
 quite clear.
 
 I would _definitely_ use two r's when abbreviating current though, since
 current has two r's. I confess that it' a major pet peeve of mine when I
 see current abbreviate with one r. It feels like it's being spelled
 wrong, since current has two r's.
 
  * basename, baseName, filename, fileName

 
 baseName
 
  * dirname, dirName, directory, getDir, getDirName

 
 dirName
 
  * drivename, driveName, drive, getDrive, getDriveName

 
 driveLetter would probably be better actually - though it _could_ be
 more than one letter if someone has an insane number of drives (it's
 usually referred to as a drive letter though). Barring that, drive would
 be fine (as long as it's a property).

 
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:
 
 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

I could have sworn that I'd seen something which allowed you to assign two-letter 
names to drives instead of just one...

Oh well, it's not like two-letter drive names would be common anyway. That just 
makes driveLetter seem like that much better a name though - especially since 
driveLetter is unambiguously a Windows thing then, as opposed to some general 
HDD thing.

- Jonathan M Davis

Mar 06 2011

Bekenn <leaveme alone.com> writes:

On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:

 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

 -Lars

Correct.  However, the rules change for UNC paths: 
http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx
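
To make the drive-letter versus UNC distinction concrete, here is a small sketch. It assumes the driveName function as it exists in present-day Phobos std.path, which is not necessarily how the proposed ltk.path behaved at the time of this thread.

```d
import std.path : driveName;

void main()
{
    version (Windows)
    {
        // An ordinary drive is a single letter followed by a colon.
        assert(driveName(`d:\file`) == "d:");
        // For a UNC path the server and share together play the role of the drive.
        assert(driveName(`\\server\share\file`) == `\\server\share`);
        // A relative path has no drive at all.
        assert(driveName(`dir\file`).length == 0);
    }
    else
    {
        // POSIX has no drives, so the result is always empty.
        assert(driveName("c:/foo").length == 0);
    }
}
```

This is also an argument against the name driveLetter: for UNC paths the returned value is not a letter.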

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 18:46:15 Bekenn wrote:
 On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:
 
 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments
 
 -Lars

 
 Correct.  However, the rules change for UNC paths:
 http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx

Now, _that_ is a great link. There's lots of good information there. Thanks!

- Jonathan M Davis

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Bekenn" <leaveme alone.com> wrote in message 
news:il1h39$19p5$2 digitalmars.com...
 On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:

 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

 -Lars

 Correct.  However, the rules change for UNC paths: 
 http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx

Great link! I can't believe how much is in there that I never even had the 
slightest clue about. The '\\?\' and '\\.\' prefixes are *completely* new to 
me, and I've been a Windows guy since 3.11.

I think these parts are particularly relevant to our discussion here:

--------------------------------------------------
Do not end a file or directory name with a space or a period. Although the 
underlying file system may support such names, the Windows shell and user 
interface does not. However, it is acceptable to specify a period as the 
first character of a name. For example, ".temp".
--------------------------------------------------

This implies three things:

1. The windows shell and UI are shitty

2. The windows filesystem *does* allow files that end in '.' just like unix, 
despite the windows shell and UI being too stupid to handle them right.

3. *Even on windows* something that starts with a dot is to be considered a 
filename, not a nameless file with an extension.

--------------------------------------------------
File I/O functions in the Windows API convert "/" to "\" as part of 
converting the name to an NT-style name, except when using the "\\?\" prefix 
as detailed in the following sections.
--------------------------------------------------

I.e., the WinAPI automatically accepts *both* slashes and backslashes as the 
directory separator, although lower-level stuff may expect backslashes.

-------------------------------------------------- 
{almost everything else}
--------------------------------------------------

Implies:

1. The ANSI/ASCII APIs should just simply *never* be used.

2. Handling all paths properly on windows is a royal fucking PITA.
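
The leading-dot rule and the two separators can be checked with a short sketch. It assumes the extension, stripExtension, baseName and isDirSeparator functions from present-day Phobos std.path. Whether the module under discussion gave identical results in 2011 is not something this sketch verifies.

```d
import std.path : baseName, extension, isDirSeparator, stripExtension;

void main()
{
    // A name that starts with a dot is a file name, not a bare extension.
    assert(extension(".bashrc").length == 0);
    assert(baseName(".bashrc") == ".bashrc");
    assert(stripExtension(".bashrc") == ".bashrc");

    // A later dot still introduces an extension.
    assert(extension(".bashrc.bak") == ".bak");

    // '/' counts as a directory separator everywhere; '\' only on Windows.
    assert(isDirSeparator('/'));
    version (Windows)
        assert(isDirSeparator('\\'));
    else
        assert(!isDirSeparator('\\'));
}
```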

Mar 07 2011

Bekenn <leaveme alone.com> writes:

On 3/7/2011 2:30 PM, Nick Sabalausky wrote:
 --------------------------------------------------
 {almost everything else}
 --------------------------------------------------

 Implies:

 1. The ANSI/ASCII APIs should just simply *never* be used.

This right here is something that I think needs to be drilled into every 
potential Windows programmer out there.  The underlying file system 
usually encodes file names in Unicode, which provides great flexibility. 
  The ANSI versions of Windows API functions *cannot* handle that.  It 
is therefore impossible to guarantee that you can handle a valid Windows 
file path using the ANSI version of a function.

ANSI versions exist /for backwards compatibility only/.  New 
functionality is often introduced without even providing an ANSI version 
of the function.  Just simply do not use ANSI functions.
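
As a rough sketch of what "use the wide versions" looks like from D: the snippet below converts a UTF-8 D string to a zero-terminated UTF-16 string with std.utf.toUTF16z and passes it to the Win32 function GetFileAttributesW, using the binding in druntime's core.sys.windows.winbase. The pathExists helper and the invalidFileAttributes constant are names made up for this illustration. On non-Windows systems the program compiles to an empty main.

```d
version (Windows)
{
    import core.sys.windows.winbase : GetFileAttributesW;
    import std.utf : toUTF16z;

    // Documented value of INVALID_FILE_ATTRIBUTES, spelled out locally.
    enum uint invalidFileAttributes = 0xFFFF_FFFF;

    // Hypothetical helper: true if the path names an existing file or directory.
    bool pathExists(string path)
    {
        // The W function takes UTF-16, so any valid file name can be expressed.
        return GetFileAttributesW(path.toUTF16z) != invalidFileAttributes;
    }
}

void main()
{
    version (Windows)
    {
        import std.stdio : writeln;

        writeln(pathExists("."));
    }
}
```

The point is only that the conversion happens once at the API boundary, so the rest of the program can keep working with ordinary D strings.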

Mar 07 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sat, 05 Mar 2011 16:32:55 +0000, Lars T. Kyllingstad wrote:

 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a
 round of voting.  Here are a few alternatives for each function.  Please
 say which ones you prefer.
 
  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt
 
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)


In summary, it seems currentDirSymbol, baseName, dirName and driveName 
are clear winners.  Less clear, but still voted for by the majority, are 
extension and stripExtension.  It is a tie between dirSep and 
dirSeparator.

Below are the votes I counted.  And before you say "hey, I didn't know we 
could make suggestions of our own", or "why did that guy get several 
votes?", this was by no means a formal vote.  It was just trying to get a 
feel for people's preferences.  Before the module gets accepted into 
Phobos there will have to be a formal review process, so there is still a 
lot of opportunity to fight over naming. :)

dirSep: 3 (Nick Sabalausky, spir, Jonathan M. Davis)
dirSeparator: 3 (Bekenn, Jim, J Chapman)

currDirSym: 1 (Jonathan M. Davis)
currDirSymbol: 2 (Nick Sabalausky, Jonathan M. Davis)
path.current: 1 (Andrej Mitrovic)
currentDirSymbol: 4 (Bekenn, Jim, J Chapman, spir)

baseName: 6 (Nick Sabalausky, Bekenn, Jim, J Chapman, spir, Jonathan M. 
Davis)
baseFileName: 1 (Nick Sabalausky)
fileName: 1 (spir)
basename: 1 (Andrei Alexandrescu)

dirName: 6 (Nick Sabalausky, Bekenn, Jim, spir, Jonathan M. Davis, David 
Nadlinger)
directory: 1 (Nick Sabalausky)
getDirName: 2 (J Chapman, spir)
dirname: 1 (Andrei Alexandrescu)

driveName: 4 (Nick Sabalausky, Bekenn, Jim, spir)
drive: 2 (Nick Sabalausky, Jonathan M. Davis)
getDriveName: 2 (J Chapman, spir)
driveLetter: 1 (Jonathan M. Davis)

ext: 1 (Nick Sabalausky)
extension: 2 (Bekenn, Jim)
getExtension: 1 (J Chapman)

stripExt: 2 (Nick Sabalausky, Jonathan M. Davis)
stripExtension: 3 (Bekenn, Jim, J Chapman)

Mar 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/6/11 6:31 AM, Lars T. Kyllingstad wrote:
 In summary, it seems currentDirSymbol, baseName, dirName and driveName
 are clear winners.  Less clear, but still voted for by the majority, are
 extension and stripExtension.  It is a tie between dirSep and
 dirSeparator.

 Below are the votes I counted.  And before you say "hey, I didn't know we
 could make suggestions of our own", or "why did that guy get several
 votes?", this was by no means a formal vote.  It was just trying to get a
 feel for people's preferences.  Before the module gets accepted into
 Phobos there will have to be a formal review process, so there is still a
 lot of opportunity to fight over naming. :)

I think whatever you choose will not please everybody, so just choose 
something and stick with it. Regarding all the extension naming stuff, I 
suggest you go with the "suffix" nomenclature which is more general and 
applicable to all OSs.

Regarding semantics, consistently strip the trailing slash. It is 
unequivocally the best semantics (and incidentally or not it's what 
Unix's dirname and basename do). If rsync et al need it, they can always 
look for it in the initial parameter. The reality of the matter is that 
you will never be able to accommodate all use cases there are with 
maximum convenience.

You may want to prepare this for review after April 1st, when the review 
for std.parallelism ends. There is good signal in the exchange so far, 
but from here on this discussion could go on forever and shift focus 
away from std.parallelism.


Andrei

Mar 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 06 Mar 2011 09:29:27 -0600, Andrei Alexandrescu wrote:

 On 3/6/11 6:31 AM, Lars T. Kyllingstad wrote:
 In summary, it seems currentDirSymbol, baseName, dirName and driveName
 are clear winners.  Less clear, but still voted for by the majority,
 are extension and stripExtension.  It is a tie between dirSep and
 dirSeparator.

 Below are the votes I counted.  And before you say "hey, I didn't know
 we could make suggestions of our own", or "why did that guy get several
 votes?", this was by no means a formal vote.  It was just trying to get
 a feel for people's preferences.  Before the module gets accepted into
 Phobos there will have to be a formal review process, so there is still
 a lot of opportunity to fight over naming. :)

 
 I think whatever you choose will not please everybody, so just choose
 something and stick with it. Regarding all the extension naming stuff, I
 suggest you go with the "suffix" nomenclature which is more general and
 applicable to all OSs.

I don't agree.  A suffix can be anything, and we already have functions 
in std.algorithm, std.array and std.string to deal with the general 
case.  Like it or not, filename extensions are still the main method for 
conveying file type information on Windows (and even to some extent on 
Linux and OSX).  I think that's a good reason to include support for 
manipulating extensions in std.path.
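
To illustrate the difference between a general suffix test and extension-aware helpers, here is a small sketch. It assumes the names found in present-day Phobos (extension, stripExtension, setExtension and defaultExtension in std.path, endsWith in std.algorithm.searching). The replaceExtension name mentioned in this thread corresponds roughly to setExtension there.

```d
import std.algorithm.searching : endsWith;
import std.path : defaultExtension, extension, setExtension, stripExtension;

void main()
{
    // The general case: any suffix, no knowledge of file names.
    assert("report.txt".endsWith(".txt"));

    // Extension-aware helpers know where the extension starts.
    assert(extension("report.txt") == ".txt");
    assert(extension("archive.tar.gz") == ".gz");
    assert(extension("README").length == 0);
    assert(stripExtension("report.txt") == "report");

    // setExtension replaces an existing extension or adds a missing one.
    assert(setExtension("report.txt", "md") == "report.md");
    assert(setExtension("README", "md") == "README.md");

    // defaultExtension only adds one when none is present.
    assert(defaultExtension("README", "txt") == "README.txt");
    assert(defaultExtension("report.md", "txt") == "report.md");
}
```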


 Regarding semantics, consistently strip the trailing slash. It is
 unequivocally the best semantics (and incidentally or not it's what
 Unix's dirname and basename do). If rsync et al need it, they can always
 look for it in the initial parameter. The reality of the matter is that
 you will never be able to accommodate all use cases there are with
 maximum convenience.

I agree, and that's how I've done it.
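
For reference, the trailing-slash semantics can be sketched as follows. This uses baseName and dirName from present-day Phobos std.path as a stand-in. That the 2011 ltk.path code gave exactly these results is an assumption.

```d
import std.path : baseName, dirName;

void main()
{
    // A trailing separator is ignored, as with Unix basename and dirname.
    assert(baseName("dir/subdir/") == "subdir");
    assert(baseName("dir/subdir") == "subdir");
    assert(dirName("dir/subdir/") == "dir");
    assert(dirName("dir/subdir") == "dir");

    // A bare file name lives in the current directory.
    assert(dirName("file") == ".");
}
```

A caller that cares about the trailing slash, as rsync does, can still inspect the original string before calling these functions.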


 You may want to prepare this for review after April 1st, when the review
 for std.parallelism ends. There is good signal in the exchange so far,
 but from here on this discussion could go on forever and shift focus
 away from std.parallelism.

Absolutely.  This was only intended as informal discussion, and not as a 
start on the formal review.

-Lars

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 07:29:27 Andrei Alexandrescu wrote:
 I think whatever you choose will not please everybody, so just choose
 something and stick with it. Regarding all the extension naming stuff, I
 suggest you go with the "suffix" nomenclature which is more general and
 applicable to all OSs.

I agree with Lars on this one. Everyone knows what an extension is. It's a 
universal concept even if it's not used as much on non-Windows OSes. There _are_ 
plenty of programs in *nix which use it internally (likely because it's a lot 
easier than dealing with MIME types) even if they shouldn't. "suffix" instead of 
"extension" or "ext" would be a lot less clear to most people and add pretty 
much no benefit.

 You may want to prepare this for review after April 1st, when the review
 for std.parallelism ends. There is good signal in the exchange so far,
 but from here on this discussion could go on forever and shift focus
 away from std.parallelism.

I agree that we've probably gotten as much out of the discussion of std.path as 
we could reasonably get prior to a full review, so continuing a major discussion 
in this thread is likely unwarranted. However, are you indicating that we should 
never have more than one module in review at a time? I see some benefit in 
spreading them out. On the other hand, if we have multiple modules ready for 
review, it seems like we could be slowing down progress unnecessarily if we 
ruled that we could only ever have one module under review at a time.

As for std.parallelism, I fear that that is the sort of module which is going to 
get close examination by a few people, and most others will either ignore it 
because they don't really intend to use it or because they fear that it will be 
too complicated to look at and review (especially if they're not all that 
well-versed in threading). So, I'm not sure how much of an in-depth examination 
it's going to get by the group at large. Which reminds me, I still need to go 
check it out...

- Jonathan M Davis

Mar 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit in
 spreading them out, on the other hand, if we have multiple modules ready for
 review, it seems like we could be slowing down progress unnecessarily if we
 ruled that we could only ever have one module under review at a time.

We should have only one review at a time. That way each review will be 
thorough. Boost does that, and I don't want to mess with success - 
particularly since the Boost community is larger too.

Andrei

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 17:35:32 Andrei Alexandrescu wrote:
 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit
 in spreading them out, on the other hand, if we have multiple modules
 ready for review, it seems like we could be slowing down progress
 unnecessarily if we ruled that we could only ever have one module under
 review at a time.

 
 We should have only one review at a time. That way each review will be
 thorough. Boost does that, and I don't want to mess with success -
 particularly since the Boost community is larger too.

In the general case, that seems like a good idea. I just don't want to get in a 
situation where we have several modules in the queue which are ready for review 
but have to wait a month or two, because another module is under review. In the 
case of std.path, that could mean that we'll have to wait nearly a month to get 
it in. That will likely push it back a whole release. So, I have mixed feelings 
on the matter. In principle, having only one module in review at a time is a 
good idea, but I fear that it will slow down our progress unnecessarily.

Still, if that's what you want to do, we might as well go forward with it for
now and review that decision if we end up with too many items on the back burner
awaiting review. While there does appear to have been a bit of an uptick in
possible modules for review of late, we haven't exactly had tons of them being
put forth for review yet either.

- Jonathan M Davis

Mar 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/6/11 8:03 PM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 17:35:32 Andrei Alexandrescu wrote:
 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit
 in spreading them out, on the other hand, if we have multiple modules
 ready for review, it seems like we could be slowing down progress
 unnecessarily if we ruled that we could only ever have one module under
 review at a time.

 We should have only one review at a time. That way each review will be
 thorough. Boost does that, and I don't want to mess with success -
 particularly since the Boost community is larger too.

 In the general case, that seems like a good idea. I just don't want to get in a
 situation where we have several modules in the queue which are ready for review
 but have to wait a month or two, because another module is under review. In the
 case of std.path, that could mean that we'll have to wait nearly a month to get
 it in. That will likely push it back a whole release. So, I have mixed feelings
 on the matter. In principle, having only one module in review at a time is a
 good idea, but I fear that it will slow down our progress unnecessarily.

 Still, if that's what you want to do, we might as well go forward with it for
 now and review that decision if we end up with too many items on the back
burner
 awaiting review. While there does appear to have been a bit of an uptick on
 possible modules for review of late, we haven't exactly had tons of them being
 put forth for review yet either.

 - Jonathan M Davis

Yah, thing is people work on stuff they care about, not the most urgent 
stuff - surprise! :o) As such we don't have a ton of proposals for 
networking and xml, but we do have one (and I won't argue it's a bad 
one) for rehashing a module that basically worked.


Andrei

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.

And it doesn't help that the people who may need a particular module aren't 
necessarily the same folks with the time and know-how to actually implement 
it...

In any case, I think that it's safe to say that we can go forward with a "one 
review at a time" policy for now and revisit it if it ever becomes a problem. 
While I don't like the fact that std.path will be delayed, the occasional delay 
of a single module likely isn't a big deal. If we actually start to get enough 
modules proposed for review that we actually get a bit of a queue going, _then_ 
it could be a problem. But until that happens, there isn't really much sense in 
worrying about it.

I _was_ thinking of putting forward a new proposal which includes the unit
testing functionality that assertPred had which won't end up in an improved
assert, so having to wait for both std.parallelism and std.path to be fully
reviewed is a bit annoying, but it's not exactly urgent. It can wait if it has
to. But both that and std.path may be able to have shorter review cycles than
more complex proposals, simply because they're not as complex. Stuff like
std.parallelism needs a thorough review. Stuff like std.path needs to be
well-reviewed, but it doesn't really need as thorough a review, since it's much
simpler functionality. So, if we end up with several smaller items for review,
we may be able to move through those faster than several large ones anyway, and
large ones are likely to be rarer simply due to the amount of work involved.

In any case, we can go forward as you suggest with the "one review at a time"
policy and work with that if and until it becomes a problem.

- Jonathan M Davis

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.


I'm not sure I'd say the current std.path "basically works", but I get what 
you mean.

 I _was_ thinking of putting forward a new proposal which includes the unit
 testing functionality that assertPred had which won't end up in an 
 improved
 assert,

Speaking of which: Now that assertPred has been rejected on the grounds of 
an improved assert that doesn't yet exist, what is the current status of the 
improved assert?

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 
 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.


 
 I'm not sure I'd say the current std.path "basically works", but I get what
 you mean.
 
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,

 
 Speaking of which: Now that assertPred has been rejected on the grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

There's an enhancement request for it:

http://d.puremagic.com/issues/show_bug.cgi?id=5547

I have no idea if any work is actually being done on it or not. It hasn't 
actually been assigned to anyone yet, for whatever that's worth. Honestly, it 
wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone 
who is capable of doing it is particularly motivated to do it (though I'm not 
sure that they're _not_ either). It was clear that a number of people wanted 
assert to be smarter rather than having assertPred, but it isn't clear that 
assert is going to be made smarter any time soon. I suspect that it will be a 
while before it's done. We'll have to wait and see though.

- Jonathan M Davis

Mar 06 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-03-07 01:20:25 -0500, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 Speaking of which: Now that assertPred has been rejected on the grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 
 There's an enhancement request for it:
 
 http://d.puremagic.com/issues/show_bug.cgi?id=5547
 
 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth. Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone
 who is capable of doing it is particularly motivated to do it (though I'm not
 sure that they're _not_ either). It was clear that a number of people wanted
 assert to be smarter rather than having assertPred, but it isn't clear that
 assert is going to be made smarter any time soon. I suspect that it will be a
 while before it's done. We'll have to wait and see though.

I gave it a try even before assertPred was rejected to check 
feasibility, made something in a few hours that should have mostly 
worked, but then realized I've been playing with the wrong assert code. 
There are apparently two code paths for asserts in DMD, one of which I'm 
not sure is used at all, and I took the wrong one to modify. I'll have 
to sort this out and possibly redo all this with the other code path 
(which seems a little more complicated because it relies on a 
per-module generated assert handler for some reason), but this'll have 
to wait until I have more time.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,

 Speaking of which: Now that assertPred has been rejected on the grounds 
 of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 There's an enhancement request for it:

 http://d.puremagic.com/issues/show_bug.cgi?id=5547

 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth. Honestly, 
 it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure that 
 anyone
 who is capable of doing it is particularly motivated to do it (though I'm 
 not
 sure that they're _not_ either). It was clear that a number of people 
 wanted
 assert to be smarter rather than having assertPred, but it isn't clear 
 that
 assert is going to be made smarter any time soon. I suspect that it will 
 be a
 while before it's done. We'll have to wait and see though.

Yea, that's what I figured, and that's why I was strongly in favor of 
assertPred despite the "promise" of assert improvements.

You're the sole author of assertPred, right? Do you mind if I include it in 
my zlib/libpng-licensed SemiTwist D Tools library ( 
http://www.dsource.org/projects/semitwist ) ? I already have an 
assert-alternative in there, but assertPred is vastly superior. (Although, 
my assert-alternative does save a list of failures instead of immediately 
throwing, which I personally find to be essential for unittests, so I would 
probably add the *optional* ability to have assertPred do the same.)

Mar 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...
 
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in
 an improved
 assert,

 
 Speaking of which: Now that assertPred has been rejected on the grounds
 of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 
 There's an enhancement request for it:
 
 http://d.puremagic.com/issues/show_bug.cgi?id=5547
 
 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth.
 Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure that
 anyone
 who is capable of doing it is particularly motivated to do it (though I'm
 not
 sure that they're _not_ either). It was clear that a number of people
 wanted
 assert to be smarter rather than having assertPred, but it isn't clear
 that
 assert is going to be made smarter any time soon. I suspect that it will
 be a
 while before it's done. We'll have to wait and see though.

 
 Yea, that's what I figured, and that's why I was strongly in favor of
 assertPred despite the "promise" of assert improvements.
 
 You're the sole author of assertPred, right? Do you mind if I include it in
 my zlib/libpng-licensed SemiTwist D Tools library (
 http://www.dsource.org/projects/semitwist ) ? I already have an
 assert-alternative in there, but assertPred is vastly superior. (Although,
 my assert-alternative does save a list of failures instead of immediately
 throwing, which I personally find to be essential for unittests, so I would
 probably add the *optional* ability to have assertPred do the same.)

Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can 
use it for whatever Boost lets you do with it, and even if what you're doing 
isn't Boost compatible, it's fine with me if you use it anyway.

I do intend to take some of its functionality which assert will never have (such
as assertPred!("opCmp", "<") or assertPred!"opAssign") and make another proposal
to add those, but that's going to have to wait until other stuff is reviewed,
and it doesn't help with what assert is supposed to be doing anyway (such as
assert(a == b)).
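
To illustrate the general idea of a predicate-based assert that reports its
operands, here is a minimal sketch. It is not the actual assertPred code: the
name assertOp, its signature and its message format are all assumptions made
for this example, and the real assertPred had a different and richer interface.

```d
import core.exception : AssertError;
import std.conv : to;

// Hypothetical, simplified stand-in for a predicate-based assert.
// The operator is passed as a compile-time string; the operand types
// are deduced from the call.
void assertOp(string op, L, R)(L lhs, R rhs,
                               string file = __FILE__, size_t line = __LINE__)
{
    // Because op is known at compile time, it can be spliced into an
    // expression with a string mixin.
    if (!mixin("lhs " ~ op ~ " rhs"))
    {
        // Unlike a plain assert(a < b), the failure message carries
        // the values of both operands.
        throw new AssertError(
            to!string(lhs) ~ " " ~ op ~ " " ~ to!string(rhs) ~ " is false",
            file, line);
    }
}

void main()
{
    assertOp!"=="(2 + 2, 4);   // passes silently
    assertOp!"<"(3, 5);        // passes silently

    bool caught = false;
    try
        assertOp!"<"(5, 3);
    catch (AssertError e)
    {
        // Catching an Error is done here only to inspect the message.
        caught = true;
        assert(e.msg == "5 < 3 is false");
    }
    assert(caught);
}
```

A plain assert(5 < 3) can only report the file and line, which is the gap
that both assertPred and the requested "improved assert" are meant to close.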

I would really have liked to get assertPred into Phobos, fancy assert or no, 
but too many people just wanted assert to be better and thought that assertPred 
was unnecessary, overcomplicated, and/or overkill.

- Jonathan M Davis

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 I _was_ thinking of putting forward a new proposal which includes 
 the
 unit testing functionality that assertPred had which won't end up in
 an improved
 assert,

 Speaking of which: Now that assertPred has been rejected on the 
 grounds
 of
 an improved assert that doesn't yet exist, what is the current status 
 of
 the improved assert?

 There's an enhancement request for it:

 http://d.puremagic.com/issues/show_bug.cgi?id=5547

 I have no idea of any work is actually being done on it or not. It 
 hasn't
 actually been assigned to anyone yet, for whatever that's worth.
 Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure 
 that
 anyone
 who is capable of doing it is particularly motivated to do it (though 
 I'm
 not
 sure that they're _not_ either). It was clear that a number of people
 wanted
 assert to be smarter rather than having assertPred, but it isn't clear
 that
 assert is going to be made smarter any time soon. I suspect that it 
 will
 be a
 while before it's done. We'll have to wait and see though.

 Yea, that's what I figured, and that's why I was strongly in favor of
 assertPred despite the "promise" of assert improvements.

 You're the sole author of assertPred, right? Do you mind if I include it 
 in
 my zlib/libpng-licensed SemiTwist D Tools library (
 http://www.dsource.org/projects/semitwist ) ? I already have an
 assert-alternative in there, but assertPred is vastly superior. 
 (Although,
 my assert-alternative does save a list of failures instead of immediately
 throwing, which I personally find to be essential for unittests, so I 
 would
 probably add the *optional* ability to have assertPred do the same.)

 Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you 
 can
 use it for whatever Boost lets you do with it, and even if what you're 
 doing
 isn't Boost compatible, it's fine with me if you use it anyway.

Thanks.

 I do intend to take some of its functionality which assert will never have 
 (such
 as assertPred!("opCmp", "<") or assertPred!"opAssign") and make another 
 proposal
 to add those, but that's going to have to wait until other stuff is 
 reviewed, and
 it doesn't help with what assert is supposed to be doing anyway (such as
 assert(a == b)).

 I would really liked to have gotten assertPred into Phobos, fancy assert 
 or no,
 but too many people just wanted assert to be better and thought that 
 assertPred
 was unnecessary, overcomplicated, and/or overkill.

Yea. I have a little bit of experience with JUnit/NUnit. Compared to that, 
assertPred is trivial and perfectly straightforward.

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Nick Sabalausky" <a a.a> wrote in message 
news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:

 Yea, that's what I figured, and that's why I was strongly in favor of
 assertPred despite the "promise" of assert improvements.

 You're the sole author of assertPred, right? Do you mind if I include it 
 in
 my zlib/libpng-licensed SemiTwist D Tools library (
 http://www.dsource.org/projects/semitwist ) ? I already have an
 assert-alternative in there, but assertPred is vastly superior. 
 (Although,
 my assert-alternative does save a list of failures instead of 
 immediately
 throwing, which I personally find to be essential for unittests, so I 
 would
 probably add the *optional* ability to have assertPred do the same.)

 Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so 
 you can
 use it for whatever Boost lets you do with it, and even if what you're 
 doing
 isn't Boost compatible, it's fine with me if you use it anyway.

 Thanks.

I've added it and made an optional 'autoThrow' flag that, if set to false, 
prevents a failure from immediately bailing out of the whole unittest (some 
people like that, like me, and others don't).

http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193

Mar 08 2011

spir <denis.spir gmail.com> writes:

On 03/08/2011 09:25 AM, Nick Sabalausky wrote:
 "Nick Sabalausky"<a a.a>  wrote in message
 news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:

 Yea, that's what I figured, and that's why I was strongly in favor of
 assertPred despite the "promise" of assert improvements.

 You're the sole author of assertPred, right? Do you mind if I include it
 in
 my zlib/libpng-licensed SemiTwist D Tools library (
 http://www.dsource.org/projects/semitwist ) ? I already have an
 assert-alternative in there, but assertPred is vastly superior.
 (Although,
 my assert-alternative does save a list of failures instead of
 immediately
 throwing, which I personally find to be essential for unittests, so I
 would
 probably add the *optional* ability to have assertPred do the same.)

 Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so
 you can
 use it for whatever Boost lets you do with it, and even if what you're
 doing
 isn't Boost compatible, it's fine with me if you use it anyway.

 Thanks.

 I've added it and made an optional 'autoThrow' flag that, if set to false,
 prevents a failure from immediately bailing out of the whole unittest (some
 people like that, like me, and others don't).

 http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193

I like it as well.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 08 2011

"Nick Sabalausky" <a a.a> writes:

"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2341.1299588465.4748.digitalmars-d puremagic.com...
 On 03/08/2011 09:25 AM, Nick Sabalausky wrote:
 "Nick Sabalausky"<a a.a>  wrote in message
 news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:

 Yea, that's what I figured, and that's why I was strongly in favor of
 assertPred despite the "promise" of assert improvements.

 You're the sole author of assertPred, right? Do you mind if I include 
 it
 in
 my zlib/libpng-licensed SemiTwist D Tools library (
 http://www.dsource.org/projects/semitwist ) ? I already have an
 assert-alternative in there, but assertPred is vastly superior.
 (Although,
 my assert-alternative does save a list of failures instead of
 immediately
 throwing, which I personally find to be essential for unittests, so I
 would
 probably add the *optional* ability to have assertPred do the same.)

 Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so
 you can
 use it for whatever Boost lets you do with it, and even if what you're
 doing
 isn't Boost compatible, it's fine with me if you use it anyway.

 Thanks.

 I've added it and made an optional 'autoThrow' flag that, if set to 
 false,
 prevents a failure from immediately bailing out of the whole unittest 
 (some
 people like that, like me, and others don't).

 http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193

 I like it as well.

If you do use it, and have autoThrow set to false, be aware that it doesn't 
*yet* catch exceptions that are thrown from the actual code being tested. 
I.e.:

unittest
{
    // I.e., the default (unless you use the unittestSection mixin)
    autoThrow = true;

    // A: AssertError is thrown, not caught and unittest bails out
    assertPred!"a"(false);

    // B: Exception is thrown, not caught and unittest bails out
    assertPred!"throw new Exception()"(10);


    autoThrow = false;

    // C: Error message is displayed, assertCount is incremented,
    //    unittest continues
    assertPred!"a"(false);

    // D: *Should* do same as C, but currently does same as B
    assertPred!"throw new Exception()"(10);
}

void main()
{
    // If autoThrow is false and there were any failures,
    // then this throws an actual AssertError
    flushAsserts();

    // Rest of main here
}

I plan to fix that though.
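
One way the missing case D could be handled is sketched below. This is not
the SemiTwist code: check, failures, flushFailures and boom are hypothetical
names invented for the example, and only the autoThrow flag is borrowed from
the description above. The assumption is that the tested expression is passed
as a lazy parameter, so that it is evaluated inside the helper's try block.

```d
import core.exception : AssertError;
import std.conv : to;

// All names here are hypothetical and only loosely modelled on the post.
bool autoThrow = true;   // true: fail immediately; false: record and continue
string[] failures;       // messages recorded while autoThrow is false

void check(lazy bool cond, string msg,
           string file = __FILE__, size_t line = __LINE__)
{
    bool failed = false;
    string detail = msg;
    try
    {
        // cond is lazy, so the tested expression runs here, inside the
        // try block, and its exceptions can be caught.
        if (!cond())
            failed = true;
    }
    catch (Exception e)
    {
        failed = true;
        detail = msg ~ " (threw: " ~ e.msg ~ ")";
    }

    if (!failed)
        return;
    if (autoThrow)
        throw new AssertError(detail, file, line);
    failures ~= file ~ "(" ~ to!string(line) ~ "): " ~ detail;
}

// Throws a single AssertError summarising everything recorded so far.
void flushFailures()
{
    if (failures.length == 0)
        return;
    auto count = failures.length;
    failures = null;
    throw new AssertError(to!string(count) ~ " check(s) failed");
}

// A function under test that throws instead of returning.
bool boom()
{
    throw new Exception("boom");
}

void main()
{
    autoThrow = false;
    check(1 + 1 == 3, "arithmetic");  // recorded, execution continues
    check(boom(), "boom");            // exception is caught and recorded too
    check(2 + 2 == 4, "fine");        // passes, nothing is recorded
    assert(failures.length == 2);
}
```

With a non-lazy parameter, boom() would be evaluated at the call site before
check is entered, so the exception would escape just as in case D above.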

Mar 08 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.


 I'm not sure I'd say the current std.path "basically works", but I get what
 you mean.

 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,

 Speaking of which: Now that assertPred has been rejected on the grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 There's an enhancement request for it:

 http://d.puremagic.com/issues/show_bug.cgi?id=5547

 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth. Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone
 who is capable of doing it is particularly motivated to do it (though I'm not
 sure that they're _not_ either). It was clear that a number of people wanted
 assert to be smarter rather than having assertPred, but it isn't clear that
 assert is going to be made smarter any time soon. I suspect that it will be a
 while before it's done. We'll have to wait and see though.

IIUC:
The problem is that this feature belongs to the category of things that cannot
be implemented by any D programmer, in D, as a lib feature, even by an expert in
the domain.
It needs to get a representation of the unevaluated expression being asserted,
meaning compiler support, meaning hard low-level C/C++ and a great knowledge of
the compiler architecture, esp. the construction of the AST. If there were a way
to "quote" D expressions, and get their representation at runtime, then we
could do it ourselves (it would imply some perf penalty, but I consider this
worthwhile compared to the tremendous expressive power gained, and in fact
totally negligible for an assert statement).
Please tell me where I'm wrong.

With the same power, I would implement at once 'varWrite':
	int x = 3; int s = square(x);
	varWrite("value: 'x' --> square: 's'");
	// --> "value: 3 --> square: 9"
or even maybe:
	int x = 3;
	varWrite("value: 'x' --> square: 'x*x'");
	// --> "value: 3 --> square: 9"


Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 07 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-07 13:55, spir wrote:
 On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most
 urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.


 I'm not sure I'd say the current std.path "basically works", but I
 get what
 you mean.

 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,

 Speaking of which: Now that assertPred has been rejected on the
 grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 There's an enhancement request for it:

 http://d.puremagic.com/issues/show_bug.cgi?id=5547

 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth.
 Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure
 that anyone
 who is capable of doing it is particularly motivated to do it (though
 I'm not
 sure that they're _not_ either). It was clear that a number of people
 wanted
 assert to be smarter rather than having assertPred, but it isn't clear
 that
 assert is going to be made smarter any time soon. I suspect that it
 will be a
 while before it's done. We'll have to wait and see though.

 IIUC:
 The problem is this feature belongs to the category of things that
 cannot be implemented by any D programmer, in D, as a lib feature, even
 by an expert in the domain.
 It needs to get a representation of the unevaluated expression beeing
 asserted, meaning compiler support, meaning hard low-level C/++ and a
 great knowledge of the compiler architecture, esp the construction of
 the AST. If there was a way to "quote" D expressions, and get their
 representation at runtime, then we could do it ourselves (would imply
 some perf penalty, but I consider this worth compared to the terrible
 expressive power gained, and in fact totally neglectible for an assert
 statement).
 Please tell me where I'm wrong.

 With the same power, I would implement at once 'varWrite':
 int x = 3; s = square(x);
 varWrite("value: 'x' --> square: 's'");
 // --> "value: 3 --> square: 9"
 or even maybe:
 int x = 3;
 varWrite("value: 'x' --> square: 'x*x'");
 // --> "value: 3 --> square: 9"


 Denis

String mixins ?

-- 
/Jacob Carlborg

Mar 07 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 02:36 PM, Jacob Carlborg wrote:
 On 2011-03-07 13:55, spir wrote:
 On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most
 urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.


 I'm not sure I'd say the current std.path "basically works", but I
 get what
 you mean.

 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,

 Speaking of which: Now that assertPred has been rejected on the
 grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?

 There's an enhancement request for it:

 http://d.puremagic.com/issues/show_bug.cgi?id=5547

 I have no idea of any work is actually being done on it or not. It hasn't
 actually been assigned to anyone yet, for whatever that's worth.
 Honestly, it
 wouldn't surprise me if it doesn't happen for a while. I'm not sure
 that anyone
 who is capable of doing it is particularly motivated to do it (though
 I'm not
 sure that they're _not_ either). It was clear that a number of people
 wanted
 assert to be smarter rather than having assertPred, but it isn't clear
 that
 assert is going to be made smarter any time soon. I suspect that it
 will be a
 while before it's done. We'll have to wait and see though.

 IIUC:
 The problem is this feature belongs to the category of things that
 cannot be implemented by any D programmer, in D, as a lib feature, even
 by an expert in the domain.
 It needs to get a representation of the unevaluated expression being
 asserted, meaning compiler support, meaning hard low-level C/++ and a
 great knowledge of the compiler architecture, esp the construction of
 the AST. If there was a way to "quote" D expressions, and get their
 representation at runtime, then we could do it ourselves (would imply
 some perf penalty, but I consider this worth compared to the terrible
 expressive power gained, and in fact totally negligible for an assert
 statement).
 Please tell me where I'm wrong.

 With the same power, I would implement at once 'varWrite':
 int x = 3; s = square(x);
 varWrite("value: 'x' --> square: 's'");
 // --> "value: 3 --> square: 9"
 or even maybe:
 int x = 3;
 varWrite("value: 'x' --> square: 'x*x'");
 // --> "value: 3 --> square: 9"


 Denis

 String mixins ?

That doesn't work: the strings must be known at compile time. And I don't want black magic.
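
For reference, here is a minimal sketch of what the string-mixin route could look like, and of the limitation raised above: the expression text has to be a compile-time constant. `show` is a hypothetical helper written for this illustration, not an existing Phobos function.

```d
import std.conv : to;

// Hypothetical helper: returns D source text for an expression that
// evaluates to "<expr> = <value of expr>".
string show(string expr)
{
    return `"` ~ expr ~ ` = " ~ to!string(` ~ expr ~ `)`;
}

void main()
{
    int x = 3;
    // show("x*x") is evaluated at compile time (CTFE); the resulting source
    // text is then compiled in place, so it can see the local variable x.
    string s = mixin(show("x*x"));
    assert(s == "x*x = 9");

    // A string that is only known at run time cannot be mixed in:
    // string e = readln(); mixin(show(e)); // would not compile
}
```

This covers literal format strings such as the varWrite examples, but not strings built at run time.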

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 07 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose
  something and stick with it. Regarding all the extension naming stuff, I
  suggest you go with the "suffix" nomenclature which is more general and
  applicable to all OSs.


 I agree with Lars on this one. Everyone knows what an extension is. It's a
 universal concept even if it's not used as much on non-Windows OSes. There
_are_
 plenty of programs in *nix which use it internally (likely because it's a lot
 easier than dealing with mime type) even if they shouldn't.

eg: numerous compilers, programming editors,... ;-)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 04:31:20 Lars T. Kyllingstad wrote:
 On Sat, 05 Mar 2011 16:32:55 +0000, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a
 round of voting.  Here are a few alternatives for each function.  Please
 say which ones you prefer.
 
  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt
 
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)

 
 In summary, it seems currentDirSymbol, baseName, dirName and driveName
 are clear winners.  Less clear, but still voted for by the majority, are
 extension and stripExtension.  It is a tie between dirSep and
 dirSeparator.
 
 Below are the votes I counted.  And before you say "hey, I didn't know we
 could make suggestions of our own", or "why did that guy get several
 votes?", this was by no means a formal vote.  It was just trying to get a
 feel for people's preferences.  Before the module gets accepted into
 Phobos there will have to be a formal review process, so there is still a
 lot of opportunity to fight over naming. :)
 
 dirSep: 3 (Nick Sabalausky, spir, Jonathan M. Davis)
 dirSeparator: 3 (Bekenn, Jim, J Chapman)
 
 currDirSym: 1 (Jonathan M. Davis)
 currDirSymbol: 2 (Nick Sabalausky, Jonathan M. Davis)
 path.current: 1 (Andrej Mitrovic)
 currentDirSymbol: 4 (Bekenn, Jim, J Chapman, spir)
 
 baseName: 6 (Nick Sabalausky, Bekenn, Jim, J Chapman, spir, Jonathan M.
 Davis)
 baseFileName: 1 (Nick Sabalausky)
 fileName: 1 (spir)
 basename: 1 (Andrei Alexandrescu)
 
 dirName: 6 (Nick Sabalausky, Bekenn, Jim, spir, Jonathan M. Davis, David
 Nadlinger)
 directory: 1 (Nick Sabalausky)
 getDirName: 2 (J Chapman, spir)
 dirname: 1 (Andrei Alexandrescu)
 
 driveName: 4 (Nick Sabalausky, Bekenn, Jim, spir)
 drive: 2 (Nick Sabalausky, Jonathan M. Davis)
 getDriveName: 2 (J Chapman, spir)
 driveLetter: 1 (Jonathan M. Davis)
 
 ext: 1 (Nick Sabalausky)
 extension: 2 (Bekenn, Jim)
 getExtension: 1 (J Chapman)
 
 stripExt: 2 (Nick Sabalausky, Jonathan M. Davis)
 stripExtension: 3 (Bekenn, Jim, J Chapman)

This is a very small sampling of even the folks here on the newsgroup, let alone
the D community at large, so I don't think that you can really base all _that_
much off of the votes. Rather, I think that you should pretty much do what Andrei
said and pick what you think is best, but now you have some opinions and
arguments from other people that you can take into consideration when naming the
functions. As Andrei said, you're never going to get everyone to agree anyway.

I think that the general guidelines here should be that the names be descriptive
but as short as they can reasonably be and still be appropriately descriptive.
Names which are not descriptive enough are likely to not be clear enough, but
names that are very descriptive but also very long are likely to get very annoying
- especially if you have to use them often and/or have to deal with a character
limit per line.

So, take what has been said into consideration and adjust the names as you think
is appropriate. I'm sure that they'll get debated further when you actually put
it up for a full review. But naming is arguably _the_ classic bike shedding
issue. It matters but not in proportion with the amount of discussion and
arguing that it gets, and you'll _never_ get everyone to agree over it.

On a side note, any functions that have changed behavior should probably have
names which are different from what's currently in std.path. So, for instance, if
your basename function has different behavior from the current std.path's
basename, you should probably give it a different name (in this case, the obvious
solution is baseName - it actually follows Phobos' naming conventions and was
the pretty clear favorite in this discussion). Otherwise, you're going to break
code when your code gets merged into Phobos. If the behavioral change is small,
then perhaps a new name is not necessary, but I know that Walter is _very_ much
against breaking code with changes to Phobos, and silently changing behavior on
someone is one of the worst ways to do that. Fortunately, I believe that pretty
much all of your functions have new names, but that _is_ something to consider
when naming stuff.

- Jonathan M Davis

Mar 06 2011

"Regan Heath" <regan netmail.co.nz> writes:

dirSep
curDirSymbol
baseName
directory
drive
ext
stripExt

I would actually prefer getDir, getDrive and getExt if there was a  
corresponding getName (instead of baseName).

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 07 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Sat, 05 Mar 2011 16:32:55 -0000, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:
 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.

  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt

Is it just me that feels dirName and getDirName are ambiguous?

i.e. in the path:
   c:\temp\folder\name\file.ext

There are 3 directories:
  - Their "names" are 'temp', 'folder' and 'name'
  - Their "paths" are c:\temp, c:\temp\folder and c:\temp\folder\name

It's the reason I think baseName is clearer than fileName, with fileName  
you're not sure if it means the complete/full filename including  
directories or just the filename itself, with or without extension.  
baseName (perhaps once you're used to the idea of it) implies the shorter  
form.  In fact.. why not call baseName on directories too, to remove the  
leading path components.

e.g.

getDir("c:\temp\folder\name\file.ext")           -> "c:\temp\folder\name"
baseName(getDir("c:\temp\folder\name\file.ext")) -> "name"
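
As a rough illustration of the composition above, here is a sketch that assumes the function names of present-day Phobos `std.path` (`dirName` and `baseName`), which may differ from the proposal under discussion. Forward slashes are used so the example does not depend on the platform's separator.

```d
import std.path : baseName, dirName;

void main()
{
    enum path = "/temp/folder/name/file.ext";

    // The directory part keeps its leading components...
    assert(dirName(path) == "/temp/folder/name");
    // ...and taking the base name of that yields the last directory's name.
    assert(baseName(dirName(path)) == "name");
    assert(baseName(path) == "file.ext");
}
```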

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday 05 March 2011 17:43:50 Andrej Mitrovic wrote:
 I dunno, maybe I'd prefer an enum.
 
 enum path : string { current = ".", up = ".." };
 
 main() { string newPath = join("C:", "Windows", "Subdir", path.up,
 path.up, "Program Files");
 newPath == r"C:\Windows\Subdir\..\..\Program Files";
 
 This is just nitpicking however. And 'current' is only used on Linux afaik?
 :)

I have no idea what's used on Windows. I rarely use it these days.

- Jonathan M Davis

Mar 05 2011

Adam Ruppe <destructionator gmail.com> writes:

current == "." on Windows too.
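
A small sketch of the idea from the quoted snippet, using plain named constants instead of an enum type. `currentDir` and `parentDir` are hypothetical names chosen for this example; `buildPath` and `dirSeparator` are taken from present-day Phobos `std.path`, which is an assumption relative to the proposal discussed here.

```d
import std.array : join;
import std.path : buildPath, dirSeparator;

// Hypothetical names for the two special path components.
enum string currentDir = ".";
enum string parentDir = "..";

void main()
{
    // buildPath joins the components with the platform's separator and
    // leaves ".." components in place; it does not resolve them.
    auto p = buildPath("Windows", "Subdir", parentDir, parentDir, "Program Files");
    auto expected = ["Windows", "Subdir", "..", "..", "Program Files"].join(dirSeparator);
    assert(p == expected);
}
```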

Mar 05 2011

Rainer Schuetze <r.sagitario gmx.de> writes:

Looks good overall. I have a few comments and nitpicks though:

   basename("dir/subdir/")             -->  "subdir"
   directory("dir/subdir/")      -->  "dir"

Is this what everybody expects? I'm not sure, but another possibility 
would be to treat these as if "dir/subdir/." is passed. What is the 
result of directory("/") or directory("d:/")?

   extension("file")               -->  ""
   extension("file.ext")           -->  "ext"

What about "file."? I tried it on NTFS, but trailing '.' seems to always 
be cut off. Is it possible to create such a file on unix systems? If 
yes, you won't be able to recreate it from the result of basename() and 
extension().
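
The round-trip problem can be sketched with three small helpers. All of them are hypothetical, written only to contrast the two conventions, and they operate on bare file names rather than full paths.

```d
import std.string : lastIndexOf;

// Extension without the dot, as in the examples quoted above.
string extNoDot(string name)
{
    auto i = name.lastIndexOf('.');
    return i <= 0 ? "" : name[i + 1 .. $];
}

// Extension including the dot.
string extWithDot(string name)
{
    auto i = name.lastIndexOf('.');
    return i <= 0 ? "" : name[i .. $];
}

// Name with the extension (and its dot) removed.
string stripExt(string name)
{
    auto i = name.lastIndexOf('.');
    return i <= 0 ? name : name[0 .. i];
}

void main()
{
    // Without the dot, "file" and "file." cannot be told apart.
    assert(extNoDot("file") == "" && extNoDot("file.") == "");
    assert(stripExt("file") == stripExt("file."));

    // With the dot, the two differ and the name can be reassembled.
    assert(extWithDot("file") == "" && extWithDot("file.") == ".");
    assert(stripExt("file.") ~ extWithDot("file.") == "file.");
    assert(stripExt("file.ext") ~ extWithDot("file.ext") == "file.ext");
}
```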

What about network shares like "\\server\share\dir\file"? Maybe it 
should also be shown in the examples? Does the "\\server" part need 
special consideration?

Rainer

Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of 
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are 
 toAbsolute() and toCanonical, because they rely on std.file.getcwd() 
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current 
 std.path.  See the other thread for some examples, or take a look at the 
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
 
 -Lars

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 00:37:15 Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

How about

baseName("dir/subdir/") -->  "subdir/"
dirName("dir/subdir/")   -->  "dir"

There _are_ programs (such as rsync) which care about whether a / is included at
the end of the path. Doing that should also deal with the "/" and "d:/" issue.
So, I can see why Lars would have made the base name of "dir/subdir/" be "subdir"
instead of "subdir/" (I don't know whether that's the current behavior or not,
so he may just have copied it from what's currently there), but it seems to me
that it will be more consistent to treat "subdir/" as the base name of
"dir/subdir/". Unfortunately, sometimes there _is_ a difference between "subdir"
and "subdir/".

  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

*nix doesn't really do anything special with any file names. The closest is files
which start with "." - most programs consider those to be hidden and don't show
them. There's definitely no problem with using "file." as a file name. This is
probably a good argument for putting the "." back in the extension like it was
before.

 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?

Probably, unfortunately. \\ is kind of like a drive letter, so it really should 
be special cased, I think.

- Jonathan M Davis

Mar 06 2011

"Jérôme M. Berger" <jeberger free.fr> writes:

Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
   basename("dir/subdir/")             -->  "subdir"
   directory("dir/subdir/")      -->  "dir"

	I would say:
basename ("dir/subdir/") -> "" (or ".")
dirname  ("dir/subdir/") -> "dir/subdir"
basename ("dir/subdir")  -> "subdir"
dirname  ("dir/subdir")  -> "dir"

	Same as Python does.

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?
   extension("file")               -->  ""
   extension("file.ext")           -->  "ext"

extension ("file")     -> ""
extension ("file.ext") -> ".ext"
extension ("file.")    -> "."

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 12:50 PM, "Jérôme M. Berger" wrote:
 Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:

    basename("dir/subdir/")             -->   "subdir"
    directory("dir/subdir/")      -->   "dir"


 	I would say:
 basename ("dir/subdir/") ->  "" (or ".")
 dirname  ("dir/subdir/") ->  "dir/subdir"
 basename ("dir/subdir")  ->  "subdir"
 dirname  ("dir/subdir")  ->  "dir"

 	Same as Python does.

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

    extension("file")               -->   ""
    extension("file.ext")           -->   "ext"


 extension ("file")     ->  ""
 extension ("file.ext") ->  ".ext"
 extension ("file.")    ->  "."

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().


This solves the issue of recomposing a file path/name from its parts. But it's 
not what people mean, expect, and need with the notion of extension. We would 
have to remember this (weird) behaviour of the extension() function, and 
systematically strip off the leading '.'. Then, we get caught when the 
result is ""! Thus, we must add a check:
	extension = path.extension(foo);
	if (extension.length > 0 && extension[0] == '.')
	    extension = extension[1..$];
Very nice...

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.

I don't know about everybody, but it is what *NIX users expect, at 
least.  I have written those functions so they adhere to the POSIX 
requirements for the 'basename' and 'dirname' commands.
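
The POSIX-style behaviour described here can be checked against present-day Phobos `std.path`. That is an assumption: it is the module as it exists today, not necessarily the exact code proposed in this thread. Only POSIX-style paths are shown; Windows drive paths such as "d:/" are left out.

```d
import std.path : baseName, dirName;

void main()
{
    // A trailing separator is ignored, as with the POSIX utilities.
    assert(baseName("dir/subdir/") == "subdir");
    assert(dirName("dir/subdir/") == "dir");
    assert(baseName("dir/subdir") == "subdir");
    assert(dirName("dir/subdir") == "dir");

    // The root directory is its own parent, as 'dirname /' prints.
    assert(dirName("/") == "/");
}
```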

 What is the
 result of directory("/") or directory("d:/")?

"/" and "d:/", respectively.  The first is what 'dirname' prints, and the 
second is the natural extension to Windows paths.  (I believe I have 
covered most corner cases in the unittests.  I think it would just be 
confusing to add all of them to the documentation.)

  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

Good point.  I don't know if there is any kind of precedent here.  What 
do others think?

 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?

Hmm.. that's another good point.  I haven't even thought of those, but 
they should probably be covered as well.  I'll look into it.

-Lars

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 03:56:53 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.

 I don't know about everybody, but it is what *NIX users expect, at
 least.  I have written those functions so they adhere to the POSIX
 requirements for the 'basename' and 'dirname' commands.

If there's a standard way to deal with that, then that's probably best.

 What is the
 result of directory("/") or directory("d:/")?

 "/" and "d:/", respectively.  The first is what 'dirname' prints, and the
 second is the natural extension to Windows paths.  (I believe I have
 covered most corner cases in the unittests.  I think it would just be
 confusing to add all of them to the documentation.)

  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

 Good point.  I don't know if there is any kind of precedent here.  What
 do others think?

I kind of like how your extension doesn't include the "." in it, since you'd 
often want to remove it anyway, but given this particular ambiguity, I think 
that it's probably better to go with the old way of including the "." in the 
extension.

- Jonathan M Davis

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikvsq5$1qr9$2 digitalmars.com...
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.

 I don't know about everybody, but it is what *NIX users expect, at
 least.  I have written those functions so they adhere to the POSIX
 requirements for the 'basename' and 'dirname' commands.

I initially felt somewhat uncomfortable with the idea of that behavior, but 
then I realized two things:

1. You don't have to constantly worry about "trailing slash" vs "no trailing 
slash" and remember the different semantics. (The "trailing slash" vs "no 
trailing slash" matter can be a real pain.)

2. It'll always treat a path to a directory the same way as a path to a 
file. (Consistency is nice. Especially since you don't always know if 
something is intended to be a file or directory.)

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 12:56 PM, Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

   >    basename("dir/subdir/")             -->   "subdir"
   >    directory("dir/subdir/")      -->   "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.

 I don't know about everybody, but it is what *NIX users expect, at
 least.  I have written those functions so they adhere to the POSIX
 requirements for the 'basename' and 'dirname' commands.

 What is the
 result of directory("/") or directory("d:/")?

 "/" and "d:/", respectively.  The first is what 'dirname' prints, and the
 second is the natural extension to Windows paths.  (I believe I have
 covered most corner cases in the unittests.  I think it would just be
 confusing to add all of them to the documentation.)

   >    extension("file")               -->   ""
   >    extension("file.ext")           -->   "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

 Good point.  I don't know if there is any kind of precedent here.  What
 do others think?

 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?

 Hmm.. that's another good point.  I haven't even thought of those, but
 they should probably be covered as well.  I'll look into it.

What about extending the notion of 'device' (see other post) to cover 'http://' 
and "ftp://"?
Would it be complicated?

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:

 On 03/06/2011 12:56 PM, Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

   >    basename("dir/subdir/")             -->   "subdir"
   >    directory("dir/subdir/")      -->   "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.

 I don't know about everybody, but it is what *NIX users expect, at
 least.  I have written those functions so they adhere to the POSIX
 requirements for the 'basename' and 'dirname' commands.

 What is the
 result of directory("/") or directory("d:/")?

 "/" and "d:/", respectively.  The first is what 'dirname' prints, and
 the second is the natural extension to Windows paths.  (I believe I
 have covered most corner cases in the unittests.  I think it would just
 be confusing to add all of them to the documentation.)

   >    extension("file")               -->   ""
   >    extension("file.ext")           -->   "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to
 always be cut off. Is it possible to create such a file on unix
 systems? If yes, you won't be able to recreate it from the result of
 basename() and extension().

 Good point.  I don't know if there is any kind of precedent here.  What
 do others think?

 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?

 Hmm.. that's another good point.  I haven't even thought of those, but
 they should probably be covered as well.  I'll look into it.

 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

I don't think std.path should handle general URIs.  It should only have 
to deal with the kind of paths you can pass to the functions in std.file 
and std.stdio.

-Lars

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in std.file
 and std.stdio.

If std.path doesn't handle uri's, then we'd need a whole other set of 
functions for dealing with uris. And at least a few of the functions would 
overlap. And then people who want to be able to handle both files and uris 
will want functions that will seamlessly handle either. So I think it really 
would be best to just bite the bullet and have std.path handle uri's.

That said, I'm not sure this would be necessary for round 1 of the new 
std.path. Could just be added later.

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 13:49:59 Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 
 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in std.file
 and std.stdio.

 
 If std.path doesn't handle uri's, then we'd need a whole other set of
 functions for dealing with uris. And at least a few of the functions would
 overlap. And then people who want to be able to handle both files and uris
 will want functions that will seamlessly handle either. So I think it
 really would be best to just bite the bullet and have std.path handle
 uri's.
 
 That said, I'm not sure this would be necessary for round 1 of the new
 std.path. Could just be added later.

We do have std.uri, though it's pretty bare-bones at the moment.

- Jonathan M Davis

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 10:49 PM, Nick Sabalausky wrote:
 "Lars T. Kyllingstad"<public kyllingen.NOSPAMnet>  wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in std.file
 and std.stdio.

 If std.path doesn't handle uri's, then we'd need a whole other set of
 functions for dealing with uris. And at least a few of the functions would
 overlap. And then people who want to be able to handle both files and uris
 will want functions that will seamlessly handle either. So I think it really
 would be best to just bite the bullet and have std.path handle uri's.

 That said, I'm not sure this would be necessary for round 1 of the new
 std.path. Could just be added later.

Right, but if there is a reasonable probability of such an extension, then we 
must think about it, so to speak, "at design time". Otherwise, various common 
issues will raise barriers to the extension later (existing codebase, detail 
conflicts, refactoring requirements... naming! ;-)) (*)
Then, once such work is well under way, the implementation may no longer be 
such a big deal. Or, conversely, we may feel the need for prototyping and 
trials to construct and/or validate a big-picture design. Etc...
To sum up: since there is no emergency (--> Andrei's last post), we have a very 
good base thanks to Lars's well-thought-out work, and there are already a number 
of people involved in the discussion -- why not?

Denis

(*) drive name --> ?
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in
 std.file and std.stdio.

 If std.path doesn't handle uri's, then we'd need a whole other set of
 functions for dealing with uris. And at least a few of the functions
 would overlap. And then people who want to be able to handle both files
 and uris will want functions that will seamlessly handle either. So I
 think it really would be best to just bite the bullet and have std.path
 handle uri's.

I am now certain that std.path should not give URIs any kind of special 
treatment, for the simple reason that most URIs are also valid paths on 
POSIX.  Specifically, file and directory names may contain the ':' 
character, and multiple consecutive slashes are treated as a single 
slash.  In other words, you can do this:

  mkdir http:
  mkdir http://www.digitalmars.com
  cd http://www.digitalmars.com

That means std.path should treat "http:" as just another path component, 
and it should treat "//" on equal footing with "/".  This is how it's 
done now, and it is how it should be.
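
To make the point concrete, here is a simplified sketch of splitting a POSIX-style path into components. `components` is a hypothetical helper written for this illustration; it is not how the proposed module is implemented, and it ignores details such as a leading root "/".

```d
import std.algorithm.iteration : filter, splitter;
import std.array : array;

// Hypothetical helper: splits on '/', dropping the empty pieces produced
// by consecutive slashes, so "//" counts the same as "/".
string[] components(string path)
{
    return path.splitter('/').filter!(c => c.length > 0).array;
}

void main()
{
    // "http:" is just an ordinary first component.
    assert(components("http://www.digitalmars.com") == ["http:", "www.digitalmars.com"]);
    // One slash or two makes no difference.
    assert(components("http:/www.digitalmars.com") == components("http://www.digitalmars.com"));
}
```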

-Lars

Mar 07 2011

Jim <bitcirkel yahoo.com> writes:

Lars T. Kyllingstad Wrote:

 On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:
 
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in
 std.file and std.stdio.

 If std.path doesn't handle uri's, then we'd need a whole other set of
 functions for dealing with uris. And at least a few of the functions
 would overlap. And then people who want to be able to handle both files
 and uris will want functions that will seamlessly handle either. So I
 think it really would be best to just bite the bullet and have std.path
 handle uri's.

 
 I am now certain that std.path should not give URIs any kind of special 
 treatment, for the simple reason that most URIs are also valid paths on 
 POSIX.  Specifically, file and directory names may contain the ':' 
 character, and multiple consecutive slashes are treated as a single 
 slash.  In other words, you can do this:
 
   mkdir http:
   mkdir http://www.digitalmars.com
   cd http://www.digitalmars.com
 
 That means std.path should treat "http:" as just another path component, 
 and it should treat "//" on equal footing with "/".  This is how it's 
 done now, and it is how it should be.
 
 -Lars


Not quite sure it would be that easy.
http://en.wikipedia.org/wiki/URI_scheme

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il28cm$2phc$1 digitalmars.com...
 On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?

 I don't think std.path should handle general URIs.  It should only have
 to deal with the kind of paths you can pass to the functions in
 std.file and std.stdio.

 If std.path doesn't handle uri's, then we'd need a whole other set of
 functions for dealing with uris. And at least a few of the functions
 would overlap. And then people who want to be able to handle both files
 and uris will want functions that will seamlessly handle either. So I
 think it really would be best to just bite the bullet and have std.path
 handle uri's.

 I am now certain that std.path should not give URIs any kind of special
 treatment, for the simple reason that most URIs are also valid paths on
 POSIX.  Specifically, file and directory names may contain the ':'
 character, and multiple consecutive slashes are treated as a single
 slash.  In other words, you can do this:

  mkdir http:
  mkdir http://www.digitalmars.com
  cd http://www.digitalmars.com

 That means std.path should treat "http:" as just another path component,
 and it should treat "//" on equal footing with "/".  This is how it's
 done now, and it is how it should be.

I really wish that wasn't such a good argument. I'm now convinced too, 
albeit reluctantly.

Like anyone else, I certainly believe that MS has made a number of bad calls 
about certain things. But this is one case where I actually wish unix 
worked the windows way: If unix weren't so permissive about filename chars, 
then we wouldn't have such ambiguities. Oh well. At least URI's have the 
file:/// protocol, so at least you can treat local and remote the same if 
you assume everything to be interpreted as a URI. I just wish it were 
possible to actually *detect* URI vs filepath outside of windows.
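
The ambiguity can be sketched in Python (an illustration only, not anything proposed for std.path). The same string parses cleanly as a URI and is also a legal relative POSIX path, so neither parser can say which one the caller meant:

```python
import posixpath
from urllib.parse import urlsplit

text = "http://www.digitalmars.com"

# Read as a URI, the string has a scheme and a host.
parts = urlsplit(text)
print(parts.scheme, parts.netloc)   # http www.digitalmars.com

# Read as a POSIX path, it is a relative path whose last component is the host name.
print(posixpath.isabs(text))        # False
print(posixpath.basename(text))     # www.digitalmars.com

# A file URI gives a local file the same shape as a remote resource.
print(urlsplit("file:///etc/hosts").path)   # /etc/hosts
```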

Mar 07 2011

Rainer Schuetze <r.sagitario gmx.de> writes:

Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:
 
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

 
 Good point.  I don't know if there is any kind of precedent here.  What 
 do others think?
 

Maybe special casing similar to the "hidden" files starting with '.':

basename("file.") --> "file."
extension("file.") --> ""

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 04:41 PM, Rainer Schuetze wrote:
 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().

 Good point. I don't know if there is any kind of precedent here. What do
 others think?

 Maybe special casing similar to the "hidden" files starting with '.':

 basename("file.") --> "file."
 extension("file.") --> ""

I agree, and this is probably the correct solution: if there is nothing after 
the dot, then it's not an extension separator, thus it's part of the baseName 
(just like if there is nothing before the dot).
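
A minimal sketch of that rule, in Python for illustration. `split_ext` and `join_ext` are hypothetical names, and returning the extension without its dot follows the examples in this thread rather than any existing API:

```python
def split_ext(name):
    """Split a file name into (stem, ext); ext does not include the dot."""
    dot = name.rfind(".")
    # No dot, a leading dot (".bashrc") or a trailing dot ("file."):
    # the dot is part of the name, not an extension separator.
    if dot <= 0 or dot == len(name) - 1:
        return name, ""
    return name[:dot], name[dot + 1:]


def join_ext(stem, ext):
    """Undo split_ext: put the dot back only when there is an extension."""
    return stem + "." + ext if ext else stem


for name in ("file", "file.ext", "file.", ".bashrc", "a.tar.gz"):
    stem, ext = split_ext(name)
    assert join_ext(stem, ext) == name   # splitting is reversible
    print(repr(name), "->", repr(stem), repr(ext))
```

Under this rule, joining the two parts always reproduces the original name, including "file." and ".bashrc".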

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/06/2011 09:37 AM, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:

I think all your questions are sensible, Rainer.

    basename("dir/subdir/")             -->  "subdir"
    directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility would be
 to treat these as if "dir/subdir/." is passed. What is the result of
 directory("/") or directory("d:/")?

Depends. We must make clear whether such funcs work:
1. indifferently for file and dir names, in which case we get the above results,
2. differently for file & dir names, in which case we would have "dir/subdir/" 
as result of both operations above,
3. only for file names, in which case we throw an error when these functions 
are called on dir names.

I find both solutions 1. and 2. conceptually problematic; the second one only a 
bit less. Maybe the only sensible choice is 3.?

    extension("file")               -->  ""
    extension("file.ext")           -->  "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always be cut
 off. Is it possible to create such a file on unix systems? If yes, you won't be
 able to recreate it from the result of basename() and extension().

This is /really/ problematic, indeed! The splitting operation *must* be 
reversible in all cases. In other words, file name/path recomposition 
must be the exact inverse of splitting.

 What about network shares like "\\server\share\dir\file"? Maybe it should also
 be shown in the examples? Does the "\\server" part need special consideration?

I think there should be a special case similar to windows drive names. Maybe, 
instead of a notion of drive, have a notion of 'device', which could then cover 
network connexion. Then, a full file path/name would be composed of:
	deviceName | dirName || baseName | extension
One issue is defining the appropriate 'joint'/sep between deviceName & dirName. 
(See split <--> recomposition above.)
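
For comparison, Python's `ntpath.splitdrive` already treats a UNC prefix as a kind of device. It sidesteps the separator question by leaving the separator at the start of the remainder, so plain concatenation restores the path. A short sketch (Python is used only to illustrate the idea; this is not a proposal for the D API):

```python
import ntpath

for path in (r"d:\dir\file.ext", r"\\server\share\dir\file"):
    device, rest = ntpath.splitdrive(path)
    # The separator stays with the remainder, so recomposition is trivial.
    assert device + rest == path
    print(device, "|", rest)
```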

What do you think?


Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

"Jérôme M. Berger" <jeberger free.fr> writes:

spir wrote:
 On 03/06/2011 09:37 AM, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:

 I think all your questions are sensible, Rainer.

    basename("dir/subdir/")             -->  "subdir"
    directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

 Depends. We must make clear whether such funcs work:
 1. indifferently for file and dir names, in which case we get the above
 results,
 2. differently for file & dir names, in which case we would have
 "dir/subdir/" as result of both operations above,
 3. only for file names, in which case we throw an error when these
 functions are called on dir names.

 I find both solutions 1. and 2. conceptually problematic; the second one
 only a bit less. Maybe the only sensible choice is 3.?

	This does not make sense because there is no way to tell whether
"foo/bar" is intended as a file name or a dir name. IMO the only
sensible thing to do is to split on the last path separator:
everything to the right is the base name (or everything if there is
no separator) and everything to the left is the dir name. This has
two very important advantages:

- It is a simple rule, so is easy to remember;

- It does not need the path to exist and it does not need to know
whether the path is intended as a file or dir.
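
Python's `posixpath.split` follows essentially this rule (it additionally drops the trailing slash from the directory part), so it gives a quick picture of what the rule yields with and without a trailing slash. This is only an illustration in Python, not the proposed D API:

```python
import posixpath

for path in ("foo/bar", "foo/bar/", "bar"):
    head, tail = posixpath.split(path)
    print(repr(path), "->", repr(head), repr(tail))

# 'foo/bar' -> 'foo' 'bar'
# 'foo/bar/' -> 'foo/bar' ''
# 'bar' -> '' 'bar'
```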

		Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Jérôme M. Berger" <jeberger free.fr> wrote in message 
news:il0f04$2ts8$1 digitalmars.com...
 This does not make sense because there is no way to tell whether
 "foo/bar" is intended as a file name or a dir name. IMO the only
 sensible thing to do is to split on the last path separator:
 everything to the right is the base name (or everything if there is
 no separator) and everything to the left is the dir name. This has
 the two very important advantages:

 - It is a simple rule, so is easy to remember;

But it doesn't have simple consequences. If I'm trying to refer to a 
particular directory there's a good chance it could be either "/foo/bar" or 
"/foo/bar/" (and the latter is *not* typically thought of as a shorthand for 
"/foo/bar/."). Those are conceptually the *exact same thing*, but with the 
"last slash" rule you suggest, they have wildy different effects when passed 
to certain std.path functions. Most notably, if it's a path with a trailing 
slash, then dirName **no longer returns the directory that *contains* the 
element specified**. It just returns the element itself *instead* of its 
containing directory.

So, since certain functions would have notably different effects with and 
without a trailing slash, and the trailing slash may or may not have been 
given (since the two styles are typically thought of as interchangeable), 
every time you call a std.path function the "last slash" rule would force 
you to go through these steps:

1. Remember if the function you're using is one that's affected.

2. If so, decide which semantics you want.

3. Detect if the "trailing-slashness" of your string matches the semantics 
you want. Which may, in fact, be impossible: If the semantics you desire 
dictate a trailing slash on directories, and your string lacks a trailing 
slash then the *only* way to proceed correctly is to know whether it's 
intended to be a file or a directory, and you don't always know.

4. Coerce your string to match the desired semantics, if possible.

5. Finally call the damned function.

 - It does not need the path to exist and it does not need to know
 whether the path is intended as a file or dir.

As I described above, it will sometimes need to know.

Alternatively, the current behavior of Lars's proposed std.path is, to the 
human mind, an equally simple rule and therefore equally simple to remember: 
That last element is the baseName and all the elements before it are the 
dirName.

In contrast to the "last slash" rule, this "last element" rule behaves 
exactly the same regardless of whether a trailing slash was appended or 
omitted and *actually* never needs to know if the path is intended as a file 
or dir. So the five steps above get condensed down to one:

1. Just call the damned function.

I'll admit, the "last element" behavior of Lars's proposed std.path did 
raise a small red flag to me at first. But the more I think about it, the 
more I think it's the best way to go.
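
The "last element" rule can be sketched in a few lines of Python (for illustration only; `last_element_split` is a hypothetical helper and makes no attempt to reproduce every corner case of the proposed module):

```python
import posixpath


def last_element_split(path):
    """Return (dir_name, base_name), ignoring any trailing slashes."""
    trimmed = path.rstrip("/") or path[:1]   # keep "/" for the root itself
    return posixpath.dirname(trimmed), posixpath.basename(trimmed)


print(last_element_split("/foo/bar"))    # ('/foo', 'bar')
print(last_element_split("/foo/bar/"))   # ('/foo', 'bar')
```

Both spellings give the same answer, so the caller never has to normalise the trailing slash first.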

Mar 06 2011

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sun, 06 Mar 2011 10:37:15 +0200, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always  
 be cut off.

It's possible to create files and directories with one trailing dot on  
Windows/NTFS. FAR Manager allows doing this, for example. I'm not sure if  
the implementation does anything special to achieve this, but it's not  
impossible. (Ditto with leading and trailing spaces.)

By the way, not sure if it's been mentioned in this discussion but:

".exe" is an executable file with no name. It's perfectly valid.

-- 
Best regards,
  Vladimir                            mailto:vladimir thecybershadow.net

Mar 06 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/6/11, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 ".exe" is an executable file with no name. It's perfectly valid.

Although for some reason Explorer never lets you do that. Well, I have
a hotkey for creating filenames so I just let autohotkey create the
file such as ".file". Doing it in explorer via rename gets: "You must
type a file name".

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message 
news:op.vrxw6dmltuzx1w cybershadow.mshome.net...
 On Sun, 06 Mar 2011 10:37:15 +0200, Rainer Schuetze <r.sagitario gmx.de> 
 wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always 
 be cut off.

 It's possible to create files and directories with one trailing dot on 
 Windows/NTFS. FAR Manager allows doing this, for example. I'm not sure if 
 the implementation does anything special to achieve this, but it's not 
 impossible. (Ditto with leading and trailing spaces.)

 By the way, not sure if it's been mentioned in this discussion but:

 ".exe" is an executable file with no name. It's perfectly valid.

It ain't valid when optlink creates it ;)

Mar 06 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility  
 would be to treat these as if "dir/subdir/." is passed. What is the  
 result of directory("/") or directory("d:/")?

?? I would expect:

   directory("dir/subdir/")      -->  "dir/subdir"

as subdir _is_ a dir, not a file, as shown by the trailing slash.  If it  
was:

   directory("dir/subdir")      -->  "dir"

as subdir is perhaps not a directory, as there is no trailing slash.

I realise this means the trailing slash becomes important, but it kinda is  
important as it does tell us when something is definitely a directory.

Alternately, we could ignore the distinction between file and directory -  
as we're essentially just parsing strings here - and have two functions:

lastComponent("dir/subdir/")  -> "subdir"
lastComponent("dir/subdir")   -> "subdir"

allButLastComponent("dir/subdir/") -> "dir/"
allButLastComponent("dir/subdir")  -> "dir/"

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 07 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Mon, 07 Mar 2011 10:25:21 +0000, Regan Heath wrote:

 On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>
 wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

 ?? I would expect:

    directory("dir/subdir/")      -->  "dir/subdir"

 as subdir _is_ a dir, not a file, as shown by the trailing slash.  If it
 was:

    directory("dir/subdir")      -->  "dir"

 as subdir is perhaps not a directory, as there is no trailing slash.

 I realise this means the trailing slash becomes important, but it kinda
 is important as it does tell us when something is definitely a
 directory.

I don't think it does, or rather, I don't think there is such a thing as 
"definitely a directory".  What about a symlink to a directory, for 
instance?  On one hand, it *is* a file that contains a reference to a 
directory, and on the other, in most respects it *acts like* a directory.

You can even argue that a "file" is simply the term used for a node in 
the filesystem tree, and that "directory" is a special kind of file that 
contains a list of other files.  This terminology is pretty standard in 
*NIX land, at least.  (Just google "everything is a file".)
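
A small sketch of the symlink case, in Python for illustration. It assumes a platform where unprivileged symlink creation works, such as Linux or macOS:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "subdir")
    link = os.path.join(tmp, "link")
    os.mkdir(target)
    os.symlink(target, link)

    is_link = os.path.islink(link)     # the link itself is a special file
    acts_as_dir = os.path.isdir(link)  # but it resolves to a directory

print(is_link, acts_as_dir)   # True True
```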

 Alternately, we could ignore the distinction between file and directory
 - as we're essentially just parsing strings here - and have two
 functions:

 lastComponent("dir/subdir/")  -> "subdir"
 lastComponent("dir/subdir")   -> "subdir"

 allButLastComponent("dir/subdir/") -> "dir/"
 allButLastComponent("dir/subdir")  -> "dir/"

That's how it's done now, and how I think it should be.  The two paths 
"dir/subdir" and "dir/subdir/" both refer to the same object in the file 
system, namely "subdir".  baseName gives you the name of the object 
referred to by a path, while dirName gives you the directory containing 
said object.  Whether that object is a file or a directory is 
irrelevant.  (And if you need to know what it is, there is always 
std.file.isDir and isFile.)

-Lars

Mar 07 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 01:08 PM, Lars T. Kyllingstad wrote:
 Alternately, we could ignore the distinction between file and directory
  - as we're essentially just parsing strings here - and have two
  functions:

  lastComponent("dir/subdir/")  ->  "subdir"
  lastComponent("dir/subdir")   ->  "subdir"

  allButLastComponent("dir/subdir/") ->  "dir/"
  allButLastComponent("dir/subdir")  ->  "dir/"


 That's how it's done now, and how I think it should be.  The two paths
 "dir/subdir" and "dir/subdir/" both refer to the same object in the file
 system, namely "subdir".  baseName gives you the name of the object
 referred to by a path, while dirName gives you the directory containing
 said object.  Whether that object is a file or a directory is
 irrelevant.  (And if you need to know what it is, there is always
 std.file.isDir and isFile.)

After some more thought, I think you are right on this point. Precisely because 
of possible trailing '/'. If OSes were clearer and more consistent, then we 
could and certainly should make a useful semantic distinction.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il2hsp$89d$2 digitalmars.com...
 On Mon, 07 Mar 2011 10:25:21 +0000, Regan Heath wrote:

 On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>
 wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

 ?? I would expect:

    directory("dir/subdir/")      -->  "dir/subdir"

 as subdir _is_ a dir, not a file, as shown by the trailing slash.  If it
 was:

    directory("dir/subdir")      -->  "dir"

 as subdir is perhaps not a directory, as there is no trailing slash.

 I realise this means the trailing slash becomes important, but it kinda
 is important as it does tell us when something is definitely a
 directory.

 I don't think it does, or rather, I don't think there is such a thing as
 "definitely a directory".  What about a symlink to a directory, for
 instance?  On one hand, it *is* a file that contains a reference to a
 directory, and on the other, in most respects it *acts like* a directory.

 You can even argue that a "file" is simply the term used for a node in
 the filesystem tree, and that "directory" is a special kind of file that
 contains a list of other files.  This terminology is pretty standard in
 *NIX land, at least.  (Just google "everything is a file".)

That's true on windows too:

"Note that a directory is simply a file with a special attribute designating 
it as a directory..."
http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx#file_and_directory_names

Mar 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)

I hate to be nitpicky, but I notice that you're the only author listed for this 
module. The current std.path has several authors - none of which are you. So, 
unless you rewrote all of the code from scratch (which you may have done), you 
really should put the other names on it too (though if you rewrote it
thoroughly enough, they may have very little left in it that they did; unfortunately, 
without knowing who wrote what, you need to put all of their names on it if any 
of the original code is there).

- Jonathan M Davis

Mar 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 06 Mar 2011 01:21:56 -0800, Jonathan M Davis wrote:

 On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)

 
 I hate to be nitpicky, but I notice that you're the only author listed
 for this module. The current std.path has several authors - none of
 which are you. So, unless you rewrote all of the code from scratch
 (which you may have done), you really should put the other names on it
 too (though if you rewrote it thoroughly enough, they may have very
 little left in it that they did; unfortunately, without knowing who
 wrote what, you need to put all of their names on it if any of the
 original code is there).

Everything you see in that module is completely rewritten from scratch.  
I started out by trying to make changes to the original std.path, but 
quickly found that I had to change so much it was better to start with a 
clean slate. 

As long as the module is a part of my own library, and doesn't contain 
anyone else's code, I'll only put my name on it.  When it gets included 
in Phobos, and I add the remaining functions (fcmp, fnmatch, fncharmatch 
and expandTilde), I will of course be sure to list all authors.

-Lars

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 03:36:50 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 01:21:56 -0800, Jonathan M Davis wrote:
 On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)

 
 I hate to be nitpicky, but I notice that you're the only author listed
 for this module. The current std.path has several authors - none of
 which are you. So, unless you rewrote all of the code from scratch
 (which you may have done), you really should put the other names on it
 too (though if you rewrote it thoroughly enough, they may have very
 little left in it that they did; unfortunately, without knowing who
 wrote what, you need to put all of their names on it if any of the
 original code is there).

 
 Everything you see in that module is completely rewritten from scratch.
 I started out by trying to make changes to the original std.path, but
 quickly found that I had to change so much it was better to start with a
 clean slate.
 
 As long as the module is a part of my own library, and doesn't contain
 anyone else's code, I'll only put my name on it.  When it gets included
 in Phobos, and I add the remaining functions (fcmp, fnmatch, fncharmatch
 and expandTilde), I will of course be sure to list all authors.

That makes sense. It's just that if you didn't rewrite it from scratch, the 
previous authors would need to be there, and we don't want to mess up on 
copyright notices, since that could conceivably cause problems at some point if 
we do mess them up.

- Jonathan M Davis

Mar 06 2011

Jim <bitcirkel yahoo.com> writes:

Lars T. Kyllingstad Wrote:

 On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:
 
 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I don't want to jinx it, but there seems to be a lot of agreement in
 this thread. Seriously, how often does that happen around here? :)

 
 Not too often, so I take it as a good sign that I'm onto something. ;)
 
 The only disagreement seems to be about the naming, so let's have a
 round of voting.  Here are a few alternatives for each function. 
 Please say which ones you prefer.
 
  * dirSeparator, dirSep, sep

 
 dirSep and pathSep. Having Separator in the name is unnecessarily long.
 
  * currentDirSymbol, currentDirSym, curDirSymbol

 
 currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if
 abbreviating both current and symbol is too much). Shorter but still
 quite clear.
 
 I would _definitely_ use two r's when abbreviating current though, since
 current has two r's. I confess that it's a major pet peeve of mine when I
 see current abbreviated with one r. It feels like it's being spelled
 wrong, since current has two r's.
 
  * basename, baseName, filename, fileName

 
 baseName
 
  * dirname, dirName, directory, getDir, getDirName

 
 dirName
 
  * drivename, driveName, drive, getDrive, getDriveName

 
 driveLetter would probably be better actually - though it _could_ be
 more than one letter if someone has an insane number of drives (it's
 usually referred to as a drive letter though). Barring that, drive would
 be fine (as long as it's a property).

 
 Interestingly, it seems drive names are actually restricted to one 
 letter.  See the last paragraph of this section:
 
 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments
 
 -Lars

Drive names in AmigaOS are longer by default iirc. Anyway, Microsoft might
someday depart from the idea of drive letters. Whether they would support
longer names or abandon drive identifiers altogether is of course
unknowable, but I think names are at least a more general concept than
letters (so we don't lock ourselves into a convention that may not last).

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 16:54:41 spir wrote:
 On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose
 
  something and stick with it. Regarding all the extension naming
  stuff, I suggest you go with the "suffix" nomenclature which is more
  general and applicable to all OSs.


 
 I agree with Lars on this one. Everyone knows what an extension is. It's
 a universal concept even if it's not used as much on non-Windows OSes.
 There _are_ plenty of programs in *nix which use it internally (likely
 because it's a lot easier than dealing with mime type) even if they
 shouldn't.

 
 eg: numerous compilers, programming editors,... ;-)

The one that really bit me IIRC was Audacious. I had some newly ripped music 
files which it wouldn't play.  As it turns out, the problem was that I had had to 
redo the settings on my ripping program shortly before, and I had forgotten to 
put the extension in the file name, so the newly ripped files had no extensions, 
and Audacious apparently used the extension to determine whether it could play 
a particular file. So, of course, it wouldn't play my files, since they had no 
extensions. Unfortunately, it took me quite a while to figure that out, and I 
ended up on a bit of a wild goose chase in the interim...

This reminds me. I should look into mime types one of these days to see what the 
appropriate way (if any) would be to put support for them in Phobos. It would be 
nice to not have to go by extension for the few programs that I have which have 
to worry about file type.

- Jonathan M Davis

Mar 06 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days to see 
 what the
 appropriate way (if any) would be to put support for them in Phobos. It 
 would be
 nice to not have to go by extension for the few programs that I have which 
 have
 to worry about file type.

I'm no unix expert, but my understanding is that mime types in the 
filesystem don't even exist at all, and that what it *really* does is use 
some complex black-box-ish algorithm that takes into account the first few 
bytes of the file, the extension, the exec flag, and god-knows-what-else to 
determine what type of file it is. Contrary to how people keep making it 
sound, mime type is *not* the determining factor (and cannot possibly be), 
but rather nothing more than the way the *result* of all that analysis is 
represented.
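
A rough sketch of the kind of detection described here, in Python for illustration. The signature table is a tiny hand-picked sample and `sniff_type` is a hypothetical helper; real tools such as file(1) consult far larger databases and more inputs:

```python
import mimetypes

# A few well-known leading byte signatures ("magic numbers").
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
    b"\x1f\x8b": "application/gzip",
}


def sniff_type(name, head):
    """Guess a MIME type from the leading bytes, else from the extension."""
    for magic, mime in MAGIC.items():
        if head.startswith(magic):
            return mime
    # Fall back to the extension; returns None when nothing matches.
    return mimetypes.guess_type(name)[0]


print(sniff_type("report", b"%PDF-1.4 ..."))   # application/pdf
print(sniff_type("notes.txt", b"hello"))       # text/plain
print(sniff_type("mystery", b"hello"))         # None
```

In this sketch the MIME type is only the label attached to the result; the decision itself comes from the bytes and the name.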

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.

 
 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really* does is use
 some complex black-box-ish algorithm that takes into account the first few
 bytes of the file, the extension, the exec flag, and god-knows-what-else to
 determine what type of file it is. Contrary to how people keep making it
 sound, mime type is *not* the determining factor (and cannot possibly be),
 but rather nothing more than the way the *result* of all that analysis is
 represented.

I thought that the first few bytes of the file _were_ the mime type. Certainly, 
from what I've seen, extension has _no_ effect on most programs. Konqueror 
certainly acts like it does everything by mime type - file associations are set 
that way.

- Jonathan M Davis

Mar 06 2011

Christopher Nicholson-Sauls <ibisbasenji gmail.com> writes:

On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.

 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really* does is use
 some complex black-box-ish algorithm that takes into account the first few
 bytes of the file, the extension, the exec flag, and god-knows-what-else to
 determine what type of file it is. Contrary to how people keep making it
 sound, mime type is *not* the determining factor (and cannot possibly be),
 but rather nothing more than the way the *result* of all that analysis is
 represented.

 
 I thought that the first few bytes of the file _were_ the mime type.
 Certainly, from what I've seen, extension has _no_ effect on most programs.
 Konqueror certainly acts like it does everything by mime type - file
 associations are set that way.
 
 - Jonathan M Davis

As someone who uses hex editors quite a bit (resorting these days to
using Okteta mainly), I can tell you I have yet to see any file's mime
embedded at the beginning, nor have I seen it in any headers/nodes when
scanning raw.  Doesn't mean it's impossible of course, and certain file
systems certainly might do this[1] but I haven't seen it yet[2].

You are quite right, though, that extension doesn't matter at all,
except in certain corner cases.  Even then, they are reasonable and
predictable things -- like SO's having the right extension.  Considering
the posix convention of "hiding" files/directories by starting the name
with a dot, it'd be hard to rely on extensions in any naive way anyhow.  ;)
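
To make the dot-file point concrete, here is a minimal sketch in D. It
assumes the extension() function from present-day Phobos std.path (not
necessarily the module under review in this thread), and naiveExtension is
a hypothetical helper written only for the comparison.

```d
import std.path : extension;
import std.string : lastIndexOf;

/// Hypothetical naive rule: the extension is whatever follows the last dot.
string naiveExtension(string name)
{
    immutable i = name.lastIndexOf('.');
    return i < 0 ? "" : name[i .. $];
}

void main()
{
    // The naive rule mistakes a hidden file's whole name for an extension.
    assert(naiveExtension(".bashrc") == ".bashrc");
    assert(naiveExtension("song.mp3") == ".mp3");

    // std.path.extension special-cases a leading dot.
    assert(extension(".bashrc") == "");
    assert(extension(".bashrc.bak") == ".bak");
    assert(extension("song.mp3") == ".mp3");
}
```

The leading-dot special case is the same one mentioned earlier in the thread
(empty extension with a .bashrc name).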

-- Chris N-S

[1] I'd just about expect the filesystem of BeOS/Haiku to do so, or
something similar to it at least.

[2] Also not saying I wouldn't want to see it, necessarily. Done right,
it'd be a damn nifty thing.

Mar 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.

 
 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really* does is
 use some complex black-box-ish algorithm that takes into account the
 first few bytes of the file, the extension, the exec flag, and
 god-knows-what-else to determine what type of file it is. Contrary to
 how people keep making it sound, mime type is *not* the determining
 factor (and cannot possibly be), but rather nothing more than the way
 the *result* of all that analysis is represented.

 
 I thought that the first few bytes of the file _were_ the mime type.
 Certainly, from what I've seen, extension has _no_ effect on most
 programs. Konqueror certainly acts like it does everything by mime type
 - file associations are set that way.
 
 - Jonathan M Davis

 
 As someone who uses hex editors quite a bit (resorting these days to
 using Okteta mainly), I can tell you I have yet to see any file's mime
 embedded at the beginning, nor have I seen it in any headers/nodes when
 scanning raw.  Doesn't mean it's impossible of course, and certain file
 systems certainly might do this[1] but I haven't seen it yet[2].
 
 You are quite right, though, that extension doesn't matter at all,
 except in certain corner cases.  Even then, they are reasonable and
 predictable things -- like SO's having the right extension.  Considering
 the posix convention of "hiding" files/directories by starting the name
 with a dot, it'd be hard to rely on extensions in any naive way anyhow.  ;)
 
 -- Chris N-S
 
 [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or
 something similar to it at least.
 
 [2] Also not saying I wouldn't want to see it, necessarily. Done right,
 it'd be a damn nifty thing.

I've never studied mime types, so I don't know much about them. It's just that
it was my understanding that the first few bytes in a file indicated its mime
type. If that isn't the case, I have no idea how you determine the mime type of
a file or what's involved in doing so. I _would_, however, like to have a way
to get a file's mime type in D, so one of these days, I'll likely be looking
into the matter.

- Jonathan M Davis

Mar 06 2011

Johannes Pfau <spam example.com> writes:

Jonathan M Davis wrote:
On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days
 to see what the
 appropriate way (if any) would be to put support for them in
 Phobos. It would be
 nice to not have to go by extension for the few programs that I
 have which have
 to worry about file type.

 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really*
 does is use some complex black-box-ish algorithm that takes into
 account the first few bytes of the file, the extension, the exec
 flag, and god-knows-what-else to determine what type of file it
 is. Contrary to how people keep making it sound, mime type is
 *not* the determining factor (and cannot possibly be), but rather
 nothing more than the way the *result* of all that analysis is
 represented.

 I thought that the first few bytes of the file _were_ the mime
 type. Certainly, from what I've seen, extension has _no_ effect on
 most programs. Konqueror certainly acts like it does everything by
 mime type
 - file associations are set that way.

 - Jonathan M Davis

 As someone who uses hex editors quite a bit (resorting these days to
 using Okteta mainly), I can tell you I have yet to see any file's
 mime embedded at the beginning, nor have I seen it in any
 headers/nodes when scanning raw.  Doesn't mean it's impossible of
 course, and certain file systems certainly might do this[1] but I
 haven't seen it yet[2].

 You are quite right, though, that extension doesn't matter at all,
 except in certain corner cases.  Even then, they are reasonable and
 predictable things -- like SO's having the right extension.
 Considering the posix convention of "hiding" files/directories by
 starting the name with a dot, it'd be hard to rely on extensions in
 any naive way anyhow.  ;)

 -- Chris N-S

 [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or
 something similar to it at least.

 [2] Also not saying I wouldn't want to see it, necessarily. Done
 right, it'd be a damn nifty thing.

I've never studied mime types, so I don't know much about them. It's
just that it was my understanding that the first few bytes in a file
indicated its mime type. If that isn't the case, I have no idea how
you determine the mime type of a file or what's involved in doing so.
I _would_, however, like to have a way to get a file's mime type in D,
so one of these days, I'll likely be looking into the matter.

- Jonathan M Davis

The mime type can be saved as meta data on some filesystems, but it's
not in the file, it's an attribute:
-----------------------------------------------------
Storing the MIME type using Extended Attributes

An implementation MAY also get a file's MIME type from the
user.mime_type extended attribute. The type given here should normally
be used in preference to any guessed type, since the user is able to
set it explicitly. Applications MAY choose to set the type when saving
files. Since many applications and filesystems do not support extended
attributes, implementations MUST NOT rely on this method being
available.
-----------------------------------------------------

If this method is not available, programs look at the content of files
for specific patterns to guess the mime type. It's not the mime type
that is saved in the file though. Consider an mp3 file: there's no
"audio/mp3" in the file, but there always is a mp3 header. If a file
is scanned and a mp3 header is found, it's safe to assume the mime
type. Most file formats also have some kind of magic number at the
beginning, so it's easier to detect those.
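
As a rough illustration of that kind of content sniffing, here is a minimal
D sketch. The helper names (guessMimeFromBytes, guessMimeOfFile) are
hypothetical and not part of Phobos, and the table only covers a handful of
well-known signatures; the real shared-mime-info database is far larger and
also weighs other evidence. Treating a leading ID3 tag as MP3 audio is an
assumption that usually, but not always, holds.

```d
import std.stdio : File;
import std.string : representation;

/// Hypothetical helper: guess a MIME type from the first bytes of a file.
string guessMimeFromBytes(const(ubyte)[] head)
{
    static immutable ubyte[] png  = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
    static immutable ubyte[] gif  = [0x47, 0x49, 0x46, 0x38]; // "GIF8"
    static immutable ubyte[] pdf  = [0x25, 0x50, 0x44, 0x46]; // "%PDF"
    static immutable ubyte[] gzip = [0x1F, 0x8B];
    static immutable ubyte[] id3  = [0x49, 0x44, 0x33];       // "ID3"

    // True if the content begins with the given magic number.
    bool has(const(ubyte)[] magic)
    {
        return head.length >= magic.length && head[0 .. magic.length] == magic;
    }

    if (has(png))  return "image/png";
    if (has(gif))  return "image/gif";
    if (has(pdf))  return "application/pdf";
    if (has(gzip)) return "application/gzip";
    if (has(id3))  return "audio/mpeg"; // assumption: ID3 tag precedes MP3 data
    return "application/octet-stream";  // unknown content
}

/// Hypothetical helper: read only the first few bytes of the file at `path`.
string guessMimeOfFile(string path)
{
    ubyte[16] buf;
    auto head = File(path, "rb").rawRead(buf[]);
    return guessMimeFromBytes(head);
}

void main()
{
    immutable ubyte[] pngHead = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0, 0];
    assert(guessMimeFromBytes(pngHead) == "image/png");
    assert(guessMimeFromBytes("%PDF-1.4".representation) == "application/pdf");
    assert(guessMimeFromBytes(null) == "application/octet-stream");
}
```

Note that the file name is never consulted here, which is exactly why a file
without an extension can still be identified this way.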

More information:
http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html
-- 
Johannes Pfau

Mar 07 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 09:19 AM, Johannes Pfau wrote:
 Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days
 to see what the
 appropriate way (if any) would be to put support for them in
 Phobos. It would be
 nice to not have to go by extension for the few programs that I
 have which have
 to worry about file type.

 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really*
 does is use some complex black-box-ish algorithm that takes into
 account the first few bytes of the file, the extension, the exec
 flag, and god-knows-what-else to determine what type of file it
 is. Contrary to how people keep making it sound, mime type is
 *not* the determining factor (and cannot possibly be), but rather
 nothing more than the way the *result* of all that analysis is
 represented.

 I thought that the first few bytes of the file _were_ the mime
 type. Certainly, from what I've seen, extension has _no_ effect on
 most programs. Konqueror certainly acts like it does everything by
 mime type
 - file associations are set that way.

 - Jonathan M Davis

 As someone who uses hex editors quite a bit (resorting these days to
 using Okteta mainly), I can tell you I have yet to see any file's
 mime embedded at the beginning, nor have I seen it in any
 headers/nodes when scanning raw.  Doesn't mean it's impossible of
 course, and certain file systems certainly might do this[1] but I
 haven't seen it yet[2].

 You are quite right, though, that extension doesn't matter at all,
 except in certain corner cases.  Even then, they are reasonable and
 predictable things -- like SO's having the right extension.
 Considering the posix convention of "hiding" files/directories by
 starting the name with a dot, it'd be hard to rely on extensions in
 any naive way anyhow.  ;)

 -- Chris N-S

 [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or
 something similar to it at least.

 [2] Also not saying I wouldn't want to see it, necessarily. Done
 right, it'd be a damn nifty thing.

 I've never studied mime types, so I don't know much about them. It's
 just that it was my understanding that the first few bytes in a file
 indicated its mime type. If that isn't the case, I have no idea how
 you determine the mime type of a file or what's involved in doing so.
 I _would_, however, like to have a way to get a file's mime type in D,
 so one of these days, I'll likely be looking into the matter.

 - Jonathan M Davis

 The mime type can be saved as meta data on some filesystems, but it's
 not in the file, it's an attribute:
 -----------------------------------------------------
 Storing the MIME type using Extended Attributes

 An implementation MAY also get a file's MIME type from the
 user.mime_type extended attribute. The type given here should normally
 be used in preference to any guessed type, since the user is able to
 set it explicitly. Applications MAY choose to set the type when saving
 files. Since many applications and filesystems do not support extended
 attributes, implementations MUST NOT rely on this method being
 available.
 -----------------------------------------------------

 If this method is not available, programs look at the content of files
 for specific patterns to guess the mime type. It's not the mime type
 that is saved in the file though. Consider an mp3 file: there's no
 "audio/mp3" in the file, but there always is a mp3 header. If a file
 is scanned and a mp3 header is found, it's safe to assume the mime
 type. Most file formats also have some kind of magic number at the
 beginning, so it's easier to detect those.

 More information:
 http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html

I would definitely love an inter-OS standard for storing the MIME type in every
file's first byte, especially the text encoding when it's text (ask Walter why
D only supports the UTFs, and even then note the cost in complexity just to
determine which UTF, including byte order!). But we're not in such a world.
And you can be sure that numerous people (super C experts) would oppose this
because of the space cost.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 07 2011

Adam D. Ruppe <destructionator gmail.com> writes:

spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.

A better solution would be to store it in the filename. Might
want more detail than one byte could allow too, so perhaps allowing
three or four bytes would be a good answer.

With the type in the filename, you can determine it easily from
a directory listing without needing to open every individual file.
This would make a big difference in listing speed on a slow filesystem
and by using the name, it is compatible with all systems too.

Mar 07 2011

"Regan Heath" <regan netmail.co.nz> writes:

On Mon, 07 Mar 2011 15:07:59 -0000, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.

 A better solution would be to store it in the filename. Might
 want more detail than one byte could allow too, so perhaps allowing
 three or four bytes would be a good answer.

 With the type in the filename, you can determine it easily from
 a directory listing without needing to open every individual file.
 This would make a big difference in listing speed on a slow filesystem
 and by using the name, it is compatible with all systems too.

:P

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Mar 07 2011

"Nick Sabalausky" <a a.a> writes:

"Adam D. Ruppe" <destructionator gmail.com> wrote in message 
news:il2sce$11lv$1 digitalmars.com...
 spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.

 A better solution would be to store it in the filename. Might
 want more detail than one byte could allow too, so perhaps allowing
 three or four bytes would be a good answer.

 With the type in the filename, you can determine it easily from
 a directory listing without needing to open every individual file.
 This would make a big difference in listing speed on a slow filesystem
 and by using the name, it is compatible with all systems too.

I agree, and have to say: Very well put :)

Mar 07 2011

Bekenn <leaveme alone.com> writes:

On 3/7/2011 7:07 AM, Adam D. Ruppe wrote:
 A better solution would be to store it in the filename. Might
 want more detail than one byte could allow too, so perhaps allowing
 three or four bytes would be a good answer.

 With the type in the filename, you can determine it easily from
 a directory listing without needing to open every individual file.
 This would make a big difference in listing speed on a slow filesystem
 and by using the name, it is compatible with all systems too.

Along those same lines: 
http://blogs.msdn.com/b/oldnewthing/archive/2009/04/15/9549682.aspx

Mar 07 2011

Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:

Jonathan M Davis wrote:

 On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to
 see what the
 appropriate way (if any) would be to put support for them in Phobos.
 It would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.

 
 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really* does is
 use some complex black-box-ish algorithm that takes into account the
 first few bytes of the file, the extension, the exec flag, and
 god-knows-what-else to determine what type of file it is. Contrary to
 how people keep making it sound, mime type is *not* the determining
 factor (and cannot possibly be), but rather nothing more than the way
 the *result* of all that analysis is represented.

 
 I thought that the first few bytes of the file _were_ the mime type.
 Certainly, from what I've seen, extension has _no_ effect on most
 programs. Konqueror certainly acts like it does everything by mime type
 - file associations are set that way.
 
 - Jonathan M Davis

 
 As someone who uses hex editors quite a bit (resorting these days to
 using Okteta mainly), I can tell you I have yet to see any file's mime
 embedded at the beginning, nor have I seen it in any headers/nodes when
 scanning raw.  Doesn't mean it's impossible of course, and certain file
 systems certainly might do this[1] but I haven't seen it yet[2].
 
 You are quite right, though, that extension doesn't matter at all,
 except in certain corner cases.  Even then, they are reasonable and
 predictable things -- like SO's having the right extension.  Considering
 the posix convention of "hiding" files/directories by starting the name
 with a dot, it'd be hard to rely on extensions in any naive way anyhow. 
 ;)
 
 -- Chris N-S
 
 [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or
 something similar to it at least.
 
 [2] Also not saying I wouldn't want to see it, necessarily. Done right,
 it'd be a damn nifty thing.

 
 I've never studied mime types, so I don't know much about them. It's just
 that it was my understanding that the first few bytes in a file indicated
 its mime type. If that isn't the case, I have no idea how you determine
 the mime type of a file or what's involved in doing so. I _would_,
 however, like to have a way to get a file's mime type in D, so one of
 these days, I'll likely be looking into the matter.
 
 - Jonathan M Davis

A good place to start is likely freedesktop.org, which maintains 
specifications, libraries and utilities aimed at enhancing interoperability 
between desktop systems. This is the page about mime types:

http://freedesktop.org/wiki/Specifications/shared-mime-info-spec

Mar 07 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I've just reported two issues with std.path.join:
http://d.puremagic.com/issues/show_bug.cgi?id=5758
http://d.puremagic.com/issues/show_bug.cgi?id=5759

Does pathJoiner suffer from the same problems?

Mar 20 2011

"Nick Sabalausky" <a a.a> writes:

"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.2639.1300648308.4748.digitalmars-d puremagic.com...
 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758

Ugh, phobos has a real problem with ctfe. There's a lot that doesn't work as 
ctfe, but should. But worse than that, regressions with ctfe-ability seem to 
be extremely common.

 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?

Mar 20 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 "Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message
 news:mailman.2639.1300648308.4748.digitalmars-d puremagic.com...
 
 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758

 
 Ugh, phobos has a real problem with ctfe. There's a lot that doesn't work
 as ctfe, but should. But worse than that, regressions with ctfe-ability
 seem to be extremely common.

Probably because CTFE is a bit of a black art with regards to what works and
what doesn't. So, it's not always obvious when something will be CTFE-able or
not. I expect that the only way to really solve this is to decide which
functions must be CTFE-able and add unit tests which fail if they aren't. As
it becomes possible to make more functions CTFE-able, they can be made
CTFE-able and have the appropriate unit tests added. But as long as none of the
Phobos devs are really worrying about whether functions are CTFE-able or not
(and I don't get the impression that we generally are - I certainly don't think
about it most of the time), they're _not_ going to notice whether the
CTFE-ability of a function changes. Though honestly, enough of Phobos is in flux
and CTFE is enough of a black box that I'm not sure that it's yet entirely
reasonable to require that Phobos functions stay CTFE-able once they're
CTFE-able. It could be that fixing a particular bug or improving the overall
design of a portion of Phobos could easily result in something becoming
non-CTFE-able.

Regardless, in the long run (if not the short run), this issue does need to be 
addressed, and Phobos devs should likely be more aware of it in general.
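
A test that fails when a function stops being CTFE-able could look roughly
like the sketch below. joinTwo is a hypothetical stand-in, not a Phobos
function; the point is only that static assert (or an enum initializer)
forces the call to be evaluated during compilation.

```d
/// Hypothetical stand-in for a library function that should stay CTFE-able.
string joinTwo(string a, string b)
{
    if (a.length == 0) return b;
    if (b.length == 0) return a;
    return a[$ - 1] == '/' ? a ~ b : a ~ "/" ~ b;
}

// Compile-time checks: `static assert` forces evaluation during compilation,
// so the build itself breaks if joinTwo ever stops working in CTFE.
static assert(joinTwo("usr", "lib") == "usr/lib");
static assert(joinTwo("usr/", "lib") == "usr/lib");

unittest
{
    // Compiled only with -unittest; an `enum` initializer also forces CTFE.
    enum atCompileTime = joinTwo("", "lib");
    assert(atCompileTime == "lib");

    // The same function checked at run time.
    assert(joinTwo("usr", "") == "usr");
}

void main() {}
```

Because the static asserts sit at module scope, a CTFE regression shows up as
a compile error rather than as a test failure that somebody has to notice.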

- Jonathan M Davis

Mar 20 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759
 
 Does pathJoiner suffer from the same problems?

Are you referring to joinPath() in my code?  If so, no.  It works at 
compile time, and correctly joins the paths.

-Lars

Mar 20 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/20/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?

 Are you referring to joinPath() in my code?  If so, no.  It works at
 compile time, and correctly joins the paths.

 -Lars

Fantastic. What about issue 5759, can it work properly so:
joinPath(curdir, r"\subdir\") == r".\subdir\"

Maybe that's Windows-only, dunno.

Mar 20 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/20/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 On 3/20/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?

 Are you referring to joinPath() in my code?  If so, no.  It works at
 compile time, and correctly joins the paths.

 -Lars

 Fantastic. What about issue 5759, can it work properly so:
 joinPath(curdir, r"\subdir\") == r".\subdir\"

 Maybe that's Windows-only, dunno.

Sorry, I'm stupid and didn't read the entirety of your post. It does
work if you said so. :)

Mar 20 2011

"Nick Sabalausky" <a a.a> writes:

"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2298.1299479088.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.

 I'm no unix expert, but my understanding is that mime types in the
 filesystem don't even exist at all, and that what it *really* does is use
 some complex black-box-ish algorithm that takes into account the first 
 few
 bytes of the file, the extension, the exec flag, and god-knows-what-else 
 to
 determine what type of file it is. Contrary to how people keep making it
 sound, mime type is *not* the determining factor (and cannot possibly 
 be),
 but rather nothing more than the way the *result* of all that analysis is
 represented.

 I thought that the first few bytes of the file _were_ the mime type. 
 Certainly,
 from what I've seen, extension has _no_ effect on most programs. Konqueror
 certainly acts like it does everything by mime type - file associations 
 are set
 that way.

No, MIME is a text-based filetype-naming system that originated from SMTP 
and then got adopted by HTTP and various other things. It's like a really 
verbose file extension that isn't stored as part of the filename. These are 
some MIME types:

application/json
application/soap+xml
application/xhtml+xml
application/x-gzip
image/jpeg
text/plain
text/xml
video/mp4
application/x-www-form-urlencoded

More info:
http://en.wikipedia.org/wiki/Mime_type

Mar 07 2011

Christopher Nicholson-Sauls <ibisbasenji gmail.com> writes:

On 03/07/11 00:09, Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days to see 
 what the
 appropriate way (if any) would be to put support for them in Phobos. It 
 would be
 nice to not have to go by extension for the few programs that I have which 
 have
 to worry about file type.

 
 I'm no unix expert, but my understanding is that mime types in the 
 filesystem don't even exist at all, and that what it *really* does is use 
 some complex black-box-ish algorithm that takes into account the first few 
 bytes of the file, the extension, the exec flag, and god-knows-what-else to 
 determine what type of file it is. Contrary to how people keep making it 
 sound, mime type is *not* the determining factor (and cannot possibly be), 
 but rather nothing more than the way the *result* of all that analysis is 
 represented.
 
 

One could likely get a good grip of the "black box" by studying the
source of the common "file" utility.  It can be surprisingly detailed in
some cases, such as the following real example:

$ file debug.log
debug.log: UTF-8 Unicode English text, with very long lines

It does generate mime types as well:

$ file -bi debug.log
text/plain; charset=utf-8
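
A D program could lean on the same utility instead of reimplementing it. The
sketch below is only an assumption about how one might do that: MimeInfo,
parseFileMimeOutput and mimeOf are hypothetical names, and mimeOf requires
the "file" command to be installed and on the PATH.

```d
import std.algorithm.searching : findSplit;
import std.process : execute;
import std.string : strip;

/// Hypothetical result type: the MIME type plus the optional charset.
struct MimeInfo
{
    string type;
    string charset;
}

/// Parse a line shaped like the output above: "type/subtype; charset=name".
MimeInfo parseFileMimeOutput(string line)
{
    MimeInfo info;
    auto parts = line.strip.findSplit(";");
    info.type = parts[0].strip;

    // Whatever follows the ';' is expected to be "charset=...", if present.
    auto kv = parts[2].strip.findSplit("=");
    if (kv[0] == "charset")
        info.charset = kv[2];
    return info;
}

/// Ask the external "file" utility; returns MimeInfo.init on failure.
MimeInfo mimeOf(string path)
{
    auto result = execute(["file", "-bi", path]);
    if (result.status != 0)
        return MimeInfo.init;
    return parseFileMimeOutput(result.output);
}

void main()
{
    auto info = parseFileMimeOutput("text/plain; charset=utf-8\n");
    assert(info.type == "text/plain");
    assert(info.charset == "utf-8");

    // A line without a charset parameter leaves charset empty.
    assert(parseFileMimeOutput("image/png").charset == "");
}
```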

-- Chris N-S

Mar 06 2011

spir <denis.spir gmail.com> writes:

On 03/07/2011 02:06 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 16:54:41 spir wrote:
 On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose

   something and stick with it. Regarding all the extension naming
   stuff, I suggest you go with the "suffix" nomenclature which is more
   general and applicable to all OSs.


 I agree with Lars on this one. Everyone knows what an extension is. It's
 a universal concept even if it's not used as much on non-Windows OSes.
 There _are_ plenty of programs in *nix which use it internally (likely
 because it's a lot easier than dealing with mime type) even if they
 shouldn't.

 eg: numerous compilers, programming editors,... ;-)

 The one that really bit me IIRC was Audacious. I had some newly ripped music
 files which it wouldn't play.  As it turns out, the problem was that I had had
 to redo the settings on my ripping program shortly before, and I had forgotten
 to put the extension in the file name, so the newly ripped files had no
 extensions, and Audacious apparently used the extension to determine whether
 it could play a particular file. So, of course, it wouldn't play my files,
 since they had no extensions. Unfortunately, it took me quite a while to
 figure that out, and I ended up on a bit of a wild goose chase in the
 interim...

 This reminds me. I should look into mime types one of these days to see what
 the appropriate way (if any) would be to put support for them in Phobos. It
 would be nice to not have to go by extension for the few programs that I have
 which have to worry about file type.

I'd say: MIME types are another wild goose chase field ;-)

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 06 2011

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
As mentioned in the "std.path.getName(): Screwy by design?" thread, I
started working on a rewrite of std.path a long time ago, but I got
sidetracked by other things. The recent discussion got me working on it
again, and it turned out there wasn't that much left to be done.

So here it is, please comment:

http://kyllingen.net/code/ltk/doc/path.html
https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

Features:

- Most functions work with all string types, i.e. all permutations of
mutable/const/immutable(char/wchar/dchar)[]. Notable exceptions are
toAbsolute() and toCanonical, because they rely on std.file.getcwd()
which returns an immutable(char)[].

- Correct behaviour in corner cases that aren't covered by the current
std.path. See the other thread for some examples, or take a look at the
unittests for a more complete picture.

- Saner naming scheme. (Still not set in stone, of course.)

-Lars

I hope I'm not too late for the party, especially because I do have a
bit of criticism for this one...
Looking at the DDoc page, this module seems to have very
platform-dependent behavior. I find this detrimental, even unsavory. I
think it's best that programs work with internal data structures that
are as platform-independent as possible, and only convert to
platform-dependent data or API at the very last possible moment, when so
required (ie, when interfacing with the actual OS, or with the user).

So, with that in mind, there is a toCanonical function that converts to
an OS-specific format, but there's no function to convert to an
OS/platform independent format?... :S

Also, what does dirName( "d:file") return on POSIX? Is it the same as on
Windows? I hope so, and that such behavior is explicitly part of the API
and not just accidental. (I don't have a Linux machine nearby to try it out
myself) Because, what if I want to refer to Windows paths from a POSIX
application? (I'm sure there are scenarios where that makes sense)

Or what if I just want my application to behave in a pedantically
platform-identical way, like having it accept backslashes as path
separators not just on Windows but on POSIX as well? (This makes much
more sense than is immediately obvious... in many cases it can be argued
to be the Right Thing)
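
To make the question concrete, here is a small sketch in Python rather than D, chosen only because its standard library exposes both rule sets (`ntpath` and `posixpath`) on every host OS. It says nothing about what the proposed module actually returns; it only shows what "Windows behaviour available on POSIX" could look like.

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX path rules, importable on any platform

# Windows rules: "d:" is a drive, so it is the directory part of "d:file".
win_dir = ntpath.dirname("d:file")

# POSIX rules: "d:file" is a plain file name with no directory part.
posix_dir = posixpath.dirname("d:file")

# Windows rules accept both '/' and '\' as separators.
mixed = ntpath.basename("dir/sub\\file")

print(repr(win_dir), repr(posix_dir), repr(mixed))
```

The same string gets a different answer depending on which rule set is selected, so the choice would have to be explicit in the API rather than implied by the host OS.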

I'm sorry if I seem a bit agitated :P , it's just that due to some more
or less recent traumatizing events (a long story relating to Windows 7)
I have become a Crusader for cross-platformness.

The other suggestion I have (mentioned by others as well) is to
generalize the drive letter to a device symbol/string/identifier. But
this only makes sense if this device segment works in a
platform-independent way. This generalization might make the path module
useful in a few new contexts. Note, I'm not saying it should handle
URIs, in fact I want to explicitly say it should not handle URIs, as
URIs have additional semantics (query and fragment parts, the percent
encoding, etc.) which should not be of concern here.

BTW, I admit I take some inspiration from this API:
http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
Note that here there is only *one* platform dependent function, the
aptly named toOSString() ...

--
Bruno Medeiros - Software Engineer

Apr 06 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:

 On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I hope I'm not too late for the party, especially because I do have a
 bit of criticism for this one...

Not at all.  Reviews of, and further work on, std.path have been put on 
hold until I have handed in my PhD thesis (which, if all goes well, 
should be very soon).  I haven't got time to participate in any extensive 
discussions on the NG right now.  So there will be ample opportunity to 
comment on the design yet. :)


 Looking at the DDoc page, this module seems to have very
 platform-dependent behavior. I find this detrimental, even unsavory. I
 think it's best that programs work with internal data structures that
 are as platform-independent as possible, and only convert to
 platform-dependent data or API at the very last possible moment, when so
 required (ie, when interfacing with the actual OS, or with the user).
 
 So, with that in mind, there is a toCanonical function that converts to
 an OS-specific format, but there's no function to convert to an
 OS/platform independent format?... :S
 
 Also, what does dirName( "d:file") return on POSIX? Is it the same as on
 Windows? I hope so, and that such behavior is explicitly part of the API
 and not just accidental. (I don't have a Linux machine nearby to try it out
 myself) Because, what if I want to refer to Windows paths from a POSIX
 application? (I'm sure there are scenarios where that makes sense)
 
 Or what if I just want my application to behave in a pedantically
 platform-identical way, like having it accept backslashes as path
 separators not just on Windows but on POSIX as well? (This makes much
 more sense than is immediately obvious... in many cases it can be argued
 to be the Right Thing)
 
 
 I'm sorry if I seem a bit agitated :P , it's just that due to some more
 or less recent traumatizing events (a long story relating to Windows 7)
 I have become a Crusader for cross-platformness.
 
 
 The other suggestion I have (mentioned by others as well) is to
 generalize the drive letter to a device symbol/string/identifier. But
 this only makes sense if this device segment works in a
 platform-independent way. This generalization might make the path module
 useful in a few new contexts. Note, I'm not saying it should handle
 URIs, in fact I want to explicitly say it should not handle URIs, as
 URIs have additional semantics (query and fragment parts, the percent
 encoding, etc.) which should not be of concern here.
 
 BTW, I admit I take some inspiration from this API:
 http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
 Note that here there is only *one* platform dependent function, the
 aptly named toOSString() ...

Thanks for the feedback, I will read it more thoroughly when I take up 
work on std.path again.  Just a general comment, though:  Having the 
exact same functionality on Windows and POSIX just doesn't work, if 
nothing else simply because "c:\dir\file" is a valid base name on POSIX.  
That is, both ':' and '\' are valid filename characters.  The ONLY 
invalid filename characters on POSIX are '/' and '\0'.

Yes, weird file names like that may be uncommon, but the library should 
be able to handle them nonetheless.
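
As a rough illustration of that ambiguity (in Python, only because its standard library ships both rule sets side by side; this is not the proposed D module), the same string can be split under each convention:

```python
import ntpath      # Windows path rules
import posixpath   # POSIX path rules

name = "c:\\dir\\file"

# POSIX rules: the string contains no '/', so all of it is one base name.
posix_base = posixpath.basename(name)

# Windows rules: "c:" is a drive and '\' separates directories.
windows_base = ntpath.basename(name)
windows_drive, windows_rest = ntpath.splitdrive(name)

print(repr(posix_base))
print(repr(windows_base), repr(windows_drive), repr(windows_rest))
```

Neither answer is wrong; they follow from different rules, which is why a single behaviour cannot serve both platforms.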

-Lars

Apr 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

 On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:
 On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 
 I hope I'm not too late for the party, especially because I do have a
 bit of criticism for this one...

 
 Not at all.  Reviews of, and further work on, std.path have been put on
 hold until I have handed in my PhD thesis (which, if all goes well,
 should be very soon).  I haven't got time to participate in any extensive
 discussions on the NG right now.  So there will be ample opportunity to
 comment on the design yet. :)
 
 Looking at the DDoc page, this module seems to have very
 platform-dependent behavior. I find this detrimental, even unsavory. I
 think it's best that programs work with internal data structures that
 are as platform-independent as possible, and only convert to
 platform-dependent data or API at the very last possible moment, when so
 required (ie, when interfacing with the actual OS, or with the user).
 
 So, with that in mind, there is a toCanonical function that converts to
 an OS-specific format, but there's no function to convert to an
 OS/platform independent format?... :S
 
 Also, what does dirName( "d:file") return on POSIX? Is it the same as on
 Windows? I hope so, and that such behavior is explicitly part of the API
 and not just accidental. (I don't have a Linux machine nearby to try it out
 myself) Because, what if I want to refer to Windows paths from a POSIX
 application? (I'm sure there are scenarios where that makes sense)
 
 Or what if I just want my application to behave in a pedantically
 platform-identical way, like having it accept backslashes as path
 separators not just on Windows but on POSIX as well? (This makes much
 more sense than is immediately obvious... in many cases it can be argued
 to be the Right Thing)
 
 
 I'm sorry if I seem a bit agitated :P , it's just that due to some more
 or less recent traumatizing events (a long story relating to Windows 7)
 I have become a Crusader for cross-platformness.
 
 
 The other suggestion I have (mentioned by others as well) is to
 generalize the drive letter to a device symbol/string/identifier. But
 this only makes sense if this device segment works in a
 platform-independent way. This generalization might make the path module
 useful in a few new contexts. Note, I'm not saying it should handle
 URIs, in fact I want to explicitly say it should not handle URIs, as
 URIs have additional semantics (query and fragment parts, the percent
 encoding, etc.) which should not be of concern here.
 
 BTW, I admit I take some inspiration from this API:
 http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
 
 Note that here there is only *one* platform dependent function, the
 aptly named toOSString() ...

 
 Thanks for the feedback, I will read it more thoroughly when I take up
 work on std.path again.  Just a general comment, though:  Having the
 exact same functionality on Windows and POSIX just doesn't work, if
 nothing else simply because "c:\dir\file" is a valid base name on POSIX.
 That is, both ':' and '\' are valid filename characters.  The ONLY
 invalid filename characters on POSIX are '/' and '\0'.
 
 Yes, weird file names like that may be uncommon, but the library should
 be able to handle them nonetheless.

And on some file systems, even / is valid! Though it's not worth it to try and 
get std.path to work with files with / in the name. It's generally a very bad 
idea to create a file with a / in the name - too many programs would choke on 
it or just plain have the wrong behavior. However, there _are_ *nix file 
systems which allow for / in file names.

- Jonathan M Davis

Apr 07 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Thu, 07 Apr 2011 03:57:18 -0700, Jonathan M Davis wrote:
 
 And on some file systems, even / is valid! Though it's not worth it to
 try and get std.path to work with files with / in the name. It's
 generally a very bad idea to create a file with a / in the name - too
 many programs would choke on it or just plain have the wrong behavior.
 However, there _are_ *nix file systems which allow for / in file names.


Which filesystems are those?  The POSIX:2008 specification specifically 
states that

    "The characters composing the name may be selected from
     the set of all character values excluding the <slash>
     character and the null byte."

where <slash> is defined as '/'.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
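
Read literally, the quoted rule amounts to a very small predicate. The sketch below is in Python and uses a hypothetical helper name, `is_valid_posix_filename`. It also assumes that an empty name is rejected, which the quote itself does not state, and it ignores length limits and the special names "." and "..".

```python
def is_valid_posix_filename(name: str) -> bool:
    """Check a single file name against the quoted rule only."""
    # Assumption (not in the quote): an empty string is not a usable name.
    if name == "":
        return False
    # The rule itself: any character except <slash> and the null byte.
    return "/" not in name and "\0" not in name


odd_but_valid = is_valid_posix_filename("c:\\dir\\file")
has_slash = is_valid_posix_filename("dir/file")
has_nul = is_valid_posix_filename("file\0name")
```

Under this reading, a name containing ':' and '\' passes, while anything containing '/' or a null byte does not.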

-Lars

Apr 07 2011

Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:

On 07/04/2011 09:32, Lars T. Kyllingstad wrote:
 On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:

 Thanks for the feedback, I will read it more thoroughly when I take up
 work on std.path again.  Just a general comment, though:  Having the
 exact same functionality on Windows and POSIX just doesn't work, if
 nothing else simply because "c:\dir\file" is a valid base name on POSIX.
 That is, both ':' and '\' are valid filename characters.  The ONLY
 invalid filename characters on POSIX are '/' and '\0'.

 Yes, weird file names like that may be uncommon, but the library should
 be able to handle them nonetheless.

 -Lars

Yeah, that's a good point. I'm not sure yet whether there is a good way to 
address both issues; I want to think about it more later.
(in Eclipse's IPath this is less of a problem because that API works 
with a path data type, not with a path string directly)
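
A minimal sketch of why a path data type sidesteps the problem, written in Python with entirely hypothetical names (`PortablePath`, `to_os_string`). It is not Eclipse's IPath and not anything in the proposed module, just the general shape of the idea: device and segments are stored separately, and only the final rendering step knows about separators.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PortablePath:
    """Hypothetical platform-independent path value."""
    device: str                # e.g. "c:", or "" when there is none
    absolute: bool             # True if the path starts at the root
    segments: Tuple[str, ...]  # components, stored without separators

    def to_os_string(self, separator: str) -> str:
        """Render with the given separator; the only platform-aware step."""
        root = separator if self.absolute else ""
        return self.device + root + separator.join(self.segments)


# A Windows-style path: device "c:", then two segments.
p = PortablePath(device="c:", absolute=True, segments=("dir", "file"))
windows_form = p.to_os_string("\\")

# A POSIX file whose single name happens to contain ':' and '\'.
odd = PortablePath(device="", absolute=False, segments=("c:\\dir\\file",))
posix_form = odd.to_os_string("/")
```

Both values render to the same string, yet they are different paths. The ambiguity exists only in the string form, which is where a string-based API has to pick a platform.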

-- 
Bruno Medeiros - Software Engineer

Apr 13 2011

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On 2011-04-07 04:38, Lars T. Kyllingstad wrote:
 On Thu, 07 Apr 2011 03:57:18 -0700, Jonathan M Davis wrote:
 And on some file systems, even / is valid! Though it's not worth it to
 try and get std.path to work with files with / in the name. It's
 generally a very bad idea to create a file with a / in the name - too
 many programs would choke on it or just plain have the wrong behavior.
 However, there _are_ *nix file systems which allow for / in file names.

 
 Which filesystems are those?  The POSIX:2008 specification specifically
 states that
 
     "The characters composing the name may be selected from
      the set of all character values excluding the <slash>
      character and the null byte."
 
 where <slash> is defined as '/'.
 
 http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html

I didn't know that Posix had anything to say on the matter (though it doesn't 
hurt my feelings any that it effectively says that / isn't valid in file 
names). However, the file systems themselves apparently don't necessarily 
stick to that. If you take a look at 
http://en.wikipedia.org/wiki/Comparison_of_file_systems you can see which file 
systems allow which characters. For instance, the exts disallow NUL and /. 
However ReiserFS, Btrfs, JFS, and XFS allow /. In fact, most of the Linux file 
systems seem to allow / (though the exts are probably the most used and they 
don't).

Still, Posix or no, I would expect that using / in a file name would be just 
asking for trouble and find no reason to support it in std.path (particularly 
when we'd rely on the underlying C calls handling it appropriately, and I 
expect that there's a good chance that they don't). But if Posix disallows it, 
then we definitely shouldn't. Still, the file systems themselves aren't 
necessarily Posix-related, and apparently quite a few of the *nix file systems 
allow /.

- Jonathan M Davis

Apr 07 2011
