digitalmars.D - Proposal for std.path replacement

reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
started working on a rewrite of std.path a long time ago, but I got 
sidetracked by other things.  The recent discussion got me working on it 
again, and it turned out there wasn't that much left to be done.

So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

Features:

- Most functions work with all string types, i.e. all permutations of 
mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are 
toAbsolute() and toCanonical, because they rely on std.file.getcwd() 
which returns an immutable(char)[].

- Correct behaviour in corner cases that aren't covered by the current 
std.path.  See the other thread for some examples, or take a look at the 
unittests for a more complete picture.

- Saner naming scheme.  (Still not set in stone, of course.)

-Lars
Mar 03 2011
next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Lars T. Kyllingstad Wrote:

 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
Well I'll vote yes. Behavior looks very clean.
Mar 03 2011
prev sibling next sibling parent reply Jerry Quinn <jlquinn optonline.net> writes:
Lars T. Kyllingstad Wrote:

 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
Rather than:

    drive() & stripDrive()
    extension() & stripExtension()

would it make sense to combine them?

    string[2] splitDrive(path)
    string[2] splitExtension(path)

Just a thought.

Jerry
Mar 03 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, March 03, 2011 10:31:20 Jerry Quinn wrote:
 Lars T. Kyllingstad Wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
Rather than: drive() & stripDrive() extension() & stripExtension() would it make sense to combine them? string[2] splitDrive(path) string[2] splitExtension(path)
Those might not be bad functions to have, but they could get _really_ annoying if they were there _instead_ of the strip functions. I, for one, am very likely to be calling functions like stripExtension and passing the result directly into another function or using it directly in an expression. Having to throw a [0] or [1] on the end of all those calls would be ugly and error-prone (since it would be really easy to use the wrong index) - not to mention, it would be less efficient, which could matter in cases where you have to process a lot of file names. So, they might not be bad functions to add, but I certainly wouldn't want to see them replace the strip functions.

- Jonathan M Davis
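To make the trade-off concrete, here is a minimal sketch of the proposed combined function next to the separate calls. It uses the extension() and stripExtension() functions from today's Phobos std.path as stand-ins; splitExtension itself is hypothetical and exists in neither std.path nor ltk.path.

```d
import std.path : extension, stripExtension;

// Hypothetical helper (not part of std.path or ltk.path):
// returns [path without its extension, the extension].
string[2] splitExtension(string path)
{
    return [stripExtension(path), extension(path)];
}

void main()
{
    auto parts = splitExtension("dir/file.txt");
    assert(parts[0] == "dir/file"); // easy to mix up [0] and [1]
    assert(parts[1] == ".txt");

    // The separate functions read more directly inside an expression.
    assert(stripExtension("dir/file.txt") ~ ".bak" == "dir/file.bak");
}
```

The combined form needs an index at every call site, while the separate functions compose directly, which is the ergonomic concern raised above.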
Mar 03 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, March 03, 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
Some comments on names:

1. They should be properly camelcased. fcmp, fncharmatch, and fnmatch are probably okay, but basename should definitely be baseName.

2. Please shorten the Separator in the names to Sep (i.e. dirSep, pathSep, and isDirSep). They're just as clear that way and less annoyingly long. Similarly, I'd rename currentDirSymbol to currDirSymbol - or maybe even have currDirSym and parentDirSym.

3. I'd prefer dirName to directory. It's shorter, closer to what was there before, and a better name IMHO. directory makes me wonder if it's checking whether the name is a directory or not (which is what std.file.isDir does).

4. It might be better to shorten extension/Extension to ext/Ext, but that works better with functions like stripExtension (stripExt) than it would with extension (ext) by itself, and if we wanted complete consistency, then changing Extension to Ext would mean changing extension to ext, which wouldn't really be desirable. I'd still be very tempted to rename the xExtension functions to xExt though. Extension is unnecessarily long.

5. setExtension might be better as replaceExtension, since set tends to imply that you're doing the change to the passed in string rather than just returning a new string with the changes.

6. I'd strongly suggest making most of the functions properties (though that would require changing the examples). Functions which are nouns (such as drive or extension) really should be properties, otherwise they shouldn't have names which are nouns. basename/baseName is a funny one though since it's a noun and really should be a property, but it does have a version which takes an extra parameter, so I'm not sure what to do with that one. Unfortunately, for some reason, at the moment you can't overload a property function with a non-property function (I keep meaning to open an enhancement request on that).

As far as examples go, assuming that you made it so that .bashrc is a file with a base name of .bashrc and no extension rather than a file with no base name and an extension of bashrc (I haven't looked at the implementation at all yet, so I don't know what you did with that), then you really should put it (or an example like it) in the examples. At a first glance, it looks good overall, but I really think that the noun functions should become properties or have their names changed and that some of the names really should be shortened. We want properly descriptive names, but it doesn't take that much for a longer name to get irritating. - Jonathan M Davis
Mar 03 2011
next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Jonathan M Davis Wrote:

 As far as examples go, assuming that you made it so that .bashrc is a file
 with a base name of .bashrc and no extension rather than a file with no base
 name and an extension of bashrc (I haven't looked at the implementation at
 all yet, so I don't know what you did with that), then you really should put
 it (or an example like it) in the examples.
He did. Empty extension with .bashrc name.
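For illustration, the behaviour described here can be checked against today's Phobos std.path, which treats a leading dot the same way. This is a sketch using the current module as a stand-in; the ltk.path code under review is not reproduced, and the file names are made up.

```d
import std.path : baseName, extension, stripExtension;

void main()
{
    // A leading dot marks a hidden file; it does not start an extension.
    assert(extension(".bashrc") == "");
    assert(stripExtension(".bashrc") == ".bashrc");
    assert(baseName("/home/user/.bashrc") == ".bashrc");

    // A dot later in the name still starts an extension.
    assert(extension(".bashrc.bak") == ".bak");
}
```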
Mar 03 2011
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 03 Mar 2011 10:39:38 -0800, Jonathan M Davis wrote:

 On Thursday, March 03, 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
Some comments on names: 1. They should be properly camelcased. fcmp, fncharmatch, and fnmatch are probably okay, but basename should definitely be baseName.
We probably couldn't disagree more. :) I think fncharmatch is a horrible name. On the other hand, basename() is named after the 'basename' UNIX utility, and doesn't mean anything on its own. At least, I've never heard of such a thing as the "base name" of a file, but please prove me wrong.
 2. Please shorten the Separator in the names to Sep (i.e. dirSep,
 pathSep, and isDirSep). They're just as clear that way and less
 annoyingly long. Similarly, I'd rename currentDirSymbol to currDirSymbol
 - or maybe even have currDirSym and parentDirSym.

 3. I'd prefer dirName to directory. It's shorter, closer to what was
 there before, and a better name IMHO. directory makes me wonder if it's
 checking whether the name is a directory or not (which is what
 std.file.isDir does).
 
 4. It might be better to shorten extension/Extension to ext/Ext, but that
 works better with functions like stripExtension (stripExt) than it
 would with extension (ext) by itself, and if we wanted complete consistency,
 then changing Extension to Ext would mean changing extension to ext,
 which wouldn't really be desirable. I'd still be very tempted to rename
 the xExtension functions to xExt though. Extension is unnecessarily
 long.
I have a preference for the longer names, but not a very strong one. I'm not going to oppose the changes if others agree with you.
 5. setExtension might be better as replaceExtension, since set tends to
 imply that you're doing the change to the passed in string rather than
 just returning a new string with the changes.
Good point.
 6. I'd strongly suggest making most of the functions properties (though
 that would require changing the examples). Functions which are nouns
 (such as drive or extension) really should be properties, otherwise they
 shouldn't have names which are nouns. basename/baseName is a funny one
 though since it's a noun and really should be a property, but it does
 have a version which takes an extra parameter, so I'm not sure what to
 do with that one. Unfortunately, for some reason, at the moment you
 can't overload property function with a non-property function (I keep
 meaning to open an enhancement request on that).
Also a good point. Not only that, most functions should be pure @safe nothrow, but I've completely forgotten to add these annotations!
 As far as examples go, assuming that you made it so that .bashrc is a
 file with a base name of .bashrc and no extension rather than a file
 with no base name and an extension of bashrc (I haven't looked at the
 implementation at all yet, so I don't know what you did with that), then
 you really should put it (or an example like it) in the examples.
It's in the examples for extension() and stripExtension().
 At a first glance, it looks good overall, but I really think that the
 noun functions should become properties or have their names changed and
 that some of the names really should be shortened. We want properly
 descriptive names, but it doesn't take that much for a longer name to
 get irritating.
 
 - Jonathan M Davis
Thanks for the feedback! -Lars
Mar 04 2011
next sibling parent spir <denis.spir gmail.com> writes:
On 03/04/2011 09:33 AM, Lars T. Kyllingstad wrote:
 1. They should be properly camelcased. fcmp, fncharmatch, and fnmatch are
  probably okay, but basename should definitely be baseName.
We probably couldn't disagree more. :) I think fncharmatch is a horrible name. On the other hand, basename() is named after the 'basename' UNIX utility, and doesn't mean anything on its own. At least, I've never heard of such a thing as the "base name" of a file, but please prove me wrong.
I agree with Jonathan about 'baseName' (2 words ==> camelcased, e basta!) (*). And indeed names like 'fcmp' or 'fncharmatch' should not even be allowed to live ;-).

Denis

(*) Except when the first part is a prefix / preposition, like input, output, subtype, supertype, transcode... in which case the name is or would be a single word in regular English.
-- 
_________________
vita es estrany
spir.wikidot.com
Mar 04 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 04 March 2011 00:33:58 Lars T. Kyllingstad wrote:
 On Thu, 03 Mar 2011 10:39:38 -0800, Jonathan M Davis wrote:
 1. They should be properly camelcased. fcmp, fncharmatch, and fnmatch are
 probably okay, but basename should definitely be baseName.
We probably couldn't disagree more. :) I think fncharmatch is a horrible name. On the other hand, basename() is named after the 'basename' UNIX utility, and doesn't mean anything on its own. At least, I've never heard of such a thing as the "base name" of a file, but please prove me wrong.
I have no problem with finding better names than those. I was more saying that the names that they have shouldn't be camelcased. They'd be absolutely hideous if they were. As for basename, I'd argue for baseName because it's properly camelcased that way (and therefore follows Phobos' general naming conventions). I don't find the fact that there's a unix utility by that name to be particularly relevant to the casing of the name. But regardless, the concept of a file's "base name" is quite clear, even if you've never heard of the unix utility before.
 2. Please shorten the Separator in the names to Sep (i.e. dirSep,
 pathSep, and isDirSep). They're just as clear that way and less
 annoyingly long. Similarly, I'd rename currentDirSymbol to currDirSymbol
 - or maybe even have currDirSym and parentDirSym.
 
 3. I'd prefer dirName to directory. It's shorter, closer to what was
 there before, and a better name IMHO. directory makes me wonder if it's
 checking whether the name is a directory or not (which is what
 std.file.isDir does).
 
 4. It might be better to shorten extension/Extension to ext/Ext, but that
 works better with functions like stripExtension (stripExt) than it
 would with extension (ext) by itself, and if we wanted complete consistency,
 then changing Extension to Ext would mean changing extension to ext,
 which wouldn't really be desirable. I'd still be very tempted to rename
 the xExtension functions to xExt though. Extension is unnecessarily
 long.
I have a preference for the longer names, but not a very strong one. I'm not going to oppose the changes if others agree with you.
I definitely like descriptive names, and my function names are often long, but I do tend to find that long names can get annoying - especially if you have to use them often. So, I think that you should generally choose shorter names as long as they are appropriately descriptive. A name like stripExt is clear enough - especially in context - to work quite well, so the longer name stripExtension is unnecessary, whereas ext may not be clear enough and the full name extension would probably be better.
 6. I'd strongly suggest making most of the functions properties (though
 that would require changing the examples). Functions which are nouns
 (such as drive or extension) really should be properties, otherwise they
 shouldn't have names which are nouns. basename/baseName is a funny one
 though since it's a noun and really should be a property, but it does
 have a version which takes an extra parameter, so I'm not sure what to
 do with that one. Unfortunately, for some reason, at the moment you
 can't overload property function with a non-property function (I keep
 meaning to open an enhancement request on that).
Also a good point. Not only that, most functions should be pure @safe nothrow, but I've completely forgotten to add these annotations!
Indeed. I should have noticed and mentioned the lack of pure and nothrow as well. I haven't generally messed with @safe yet though, since so many critical functions in Phobos aren't @safe yet, so you can't really make much @safe. If you can though, it's definitely desirable.
 As far as examples go, assuming that you made it so that .bashrc is a
 file with a base name of .bashrc and no extension rather than a file
 with no base name and an extension of bashrc (I haven't looked at the
 implementation at all yet, so I don't know what you did with that), then
 you really should put it (or an example like it) in the examples.
It's in the examples for extension() and stripExtension().
Ah, so it is. I missed it. I was looking for something real like .bashrc rather than .file, and I glanced over it too quickly to notice it. - Jonathan M Davis
Mar 04 2011
prev sibling parent reply spir <denis.spir gmail.com> writes:
On 03/04/2011 12:01 PM, Jonathan M Davis wrote:
 I have a preference for the longer names, but not a very strong one.  I'm
  not going to oppose the changes if others agree with you.
I definitely like descriptive names, and my function names are often long, but I do tend to find that long names can get annoying - especially if you have to use them often. So, I think that you should generally choose shorter names as long as they are appropriately descriptive. A name like stripExt is clear enough - especially in context - to work quite well, so the longer name stripExtension is unnecessary, whereas ext may not be clear enough and the full name extension
I tend to agree with you. Especially on the point that (very) common names can be shorter. On one hand, they are more easily understood & memorised precisely because they are common; on the other, you get the maximum benefit in terms of user-friendliness for the same reason (that they are common). Abbreviating rarer names makes the code harder to understand for (very) little benefit.

Now, is stripExt/stripExtension that common? I would say no. The day you need it, you may have to write it several times because you're dealing with a piece of code that copes with file names. Right, then, you may like it to be shorter. But this "pain" will soon stop; and maybe, probably?, you won't have to write that name again for weeks or months. What do you think?

Another factor is the inherent clarity of the abbreviation. 'ext' can certainly be interpreted in various ways. As you say, context helps much; but it's a decisive argument for languages in which context prefixes, such as module names, are commonly used: e.g. "path.stripExt(fileName)". But this is not common practice in D, thus func names need to be more precise, I guess.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com
Mar 04 2011
parent "Nick Sabalausky" <a a.a> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2175.1299248868.4748.digitalmars-d puremagic.com...
 On 03/04/2011 12:01 PM, Jonathan M Davis wrote:
 I have a preference for the longer names, but not a very strong one. 
 I'm
  not going to oppose the changes if others agree with you.
I definitely like descriptive names, and my function names are often long, but I do tend to find that long names can get annoying - especially if you have to use them often. So, I think that you should generally choose shorter names as long as they are appropriately descriptive. A name like stripExt is clear enough - especially in context - to work quite well, so the longer name stripExtension is unnecessary, whereas ext may not be clear enough and the full name extension
I tend to agree with you. Especially on the point that (very) common names can be shorter. On one hand, they are more easily understood & memorised precisely because they are common; on the other, you get the maximum benefit in terms of user-friendliness for the same reason (that they are common). Abbreviating rarer names makes the code harder to understand for (very) little benefit. Now, is stripExt/stripExtension that common? I would say no. The day you need it, you may have to write it several times because you're dealing with a piece of code that copes with file names. Right, then, you may like it to be shorter. But this "pain" will soon stop; and maybe, probably?, you won't have to write that name again for weeks or months. What do you think? Another factor is the inherent clarity of the abbreviation. 'ext' can certainly be interpreted in various ways. As you say, context helps much; but it's a decisive argument for languages in which context prefixes, such as module names, are commonly used: e.g. "path.stripExt(fileName)". But this is not common practice in D, thus func names need to be more precise, I guess.
Maybe it's just me having been knee-deep in the Win/MS-DOS world since well into the 8.3 days, but "ext" always instinctively means "file extension" to me. Of course, like I said, I'm happy with "extension" too, but just FWIW.
Mar 04 2011
prev sibling next sibling parent reply Graham St Jack <Graham.StJack internode.on.net> writes:
On 04/03/11 02:59, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
I like it. It certainly looks a lot cleaner than the current std.path. I am interested in why you chose to use templates to allow not only char, dchar and wchar arrays, but also const, mutable, and immutable. My first instinct would be to use non-templated functions that take const char[]. -- Graham St Jack
Mar 03 2011
parent reply Bekenn <leaveme alone.com> writes:
On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present.

As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.

References:
    http://msdn.microsoft.com/en-us/library/dd317752%28v=VS.85%29.aspx
    http://blogs.msdn.com/b/michkap/archive/2006/10/24/867880.aspx
    http://blogs.msdn.com/b/michkap/archive/2006/08/22/707665.aspx
    http://blogs.msdn.com/b/michkap/archive/2007/05/07/2464778.aspx

When I first started looking at D, I compiled the win32 example on the D web page. I noticed it used MessageBoxA, so I changed that to MessageBoxW. That generated an error, because nobody had bothered to add a MessageBoxW declaration. That was the very last time I used std.c.windows.
Mar 03 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 03 March 2011 18:04:11 Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present.
It's not a bad thing for functions to be templatized on string type. However, I would point out that it's fairly common practice to just use string everywhere except where you need a string to be a random-access range - in which case you use dstring - or where you need to pass a string to a Windows system function, in which case you convert it to a wstring or wchar[]. The need to use wstring when using Phobos is practically non-existent. Now, if you're frequently calling Windows system functions directly, then wstring and wchar[] would be used far more frequently, but don't expect Phobos to be using wstring much. It's likely that more of Phobos will become string-type-agnostic and templatize on string type, but there may be functions which aren't, simply due to the increased complexity or because no one gets around to it with everything else that needs doing. The normal string type to use is string, so that's generally what is designed for.
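As a rough illustration of the "keep string everywhere, convert at the boundary" practice, the sketch below converts a UTF-8 path to UTF-16 with std.conv.to and std.utf.toUTF16z. The path is made up, and no Windows function is actually called, so the snippet stays portable.

```d
import std.conv : to;
import std.utf : toUTF16z;

void main()
{
    // Hypothetical path with a non-ASCII character, held as a UTF-8 string.
    string path = r"C:\données\file.txt";

    // Convert to UTF-16 only where a wide-character API would need it.
    wstring wide = to!wstring(path);
    assert(to!string(wide) == path); // the round trip preserves the text

    // toUTF16z yields a zero-terminated pointer suitable for a W function.
    const(wchar)* wz = toUTF16z(path);
    assert(wz[0 .. wide.length] == wide);
    assert(wz[wide.length] == 0);
}
```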
 As a Windows developer, I prefer to use wchar strings by default and use
 only the W versions of the Windows API functions, because the A versions
 severely limit functionality.  Only the W versions have full support for
 Unicode; the A versions are entirely dependent on the current (8-bit)
 code page.  This means no support for UNC paths or paths longer than 260
 characters, and also means that international characters commonly end up
 completely garbled.  Good practice in Windows is to consider the A
 versions deprecated and avoid them like the plague.
 
 References:
 	http://msdn.microsoft.com/en-us/library/dd317752%28v=VS.85%29.aspx
 	http://blogs.msdn.com/b/michkap/archive/2006/10/24/867880.aspx
 	http://blogs.msdn.com/b/michkap/archive/2006/08/22/707665.aspx
 	http://blogs.msdn.com/b/michkap/archive/2007/05/07/2464778.aspx
 
 When I first started looking at D, I compiled the win32 example on the D
 web page.  I noticed it used MessageBoxA, so I changed that to
 MessageBoxW.  That generated an error, because nobody had bothered to
 add a MessageBoxW declaration.  That was the very last time I used
 std.c.windows.
If there are key functions like that that you expect to be in druntime (in this case, it would be core.sys.windows.windows - I believe that std.c is deprecated, or at least it's not intended to be used anymore; OS-specific functions like that go in core), then open up enhancement requests for them, or they're unlikely to be added. I don't believe that anyone is going through the entirety of the Windows API (or the Posix API for that matter) and adding all those functions to druntime. They generally get added as required by Phobos or other stuff in druntime or because someone requests it. Not to mention, you can always use github to make the appropriate changes to druntime and create a pull request. That would likely speed up such improvements.

- Jonathan M Davis
Mar 03 2011
prev sibling parent reply Graham St Jack <Graham.StJack internode.on.net> writes:
On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take 
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them.

My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of:
* Accepting both mutable and immutable data.
* Declares that the function won't mutate the data.
* Declares that the function doesn't expect the data to be immutable.

It would be even better to use const scope char[], declaring that a reference won't be kept, but it seems that scope in this context is deprecated.

Once upon a time "in" meant const scope. Does anyone know what it means now?

-- 
Graham St Jack
Mar 03 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable. It would be even better to use const scope char[], declaring that a reference won't be kept, but it seems that scope in this context is deprecated. Once upon a time "in" meant const scope. Does anyone know what it means now?
That's still what it means. scope in this context is _not_ deprecated. Only scoped local variables (not scoped parameters or scope statements) are deprecated. in would be the correct thing to use. It's used elsewhere with strings in Phobos. And yes, as long as the strings being passed in are not being mutated, then having the parameters be in is the correct thing to do.

- Jonathan M Davis
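A minimal sketch of the combination discussed in this subthread: one function templated on the character type, taking a const-qualified slice so that mutable, const, and immutable arrays of char, wchar, or dchar are all accepted. The function countSlashes is hypothetical and only for illustration, and the parameter is spelled const(C)[] rather than in, as an assumption that keeps the example independent of how in is defined.

```d
import std.traits : isSomeChar;

// Hypothetical function: counts '/' code units in a path of any character width.
// const(C)[] promises not to mutate the data and accepts mutable or immutable input.
size_t countSlashes(C)(const(C)[] path) if (isSomeChar!C)
{
    size_t n = 0;
    foreach (c; path)
    {
        if (c == '/')
            ++n;
    }
    return n;
}

void main()
{
    char[] mutablePath = "a/b/c".dup; // mutable char[]
    string narrow = "a/b/c";          // immutable(char)[]
    wstring wide = "a/b/c"w;          // immutable(wchar)[]
    dstring dwide = "a/b/c"d;         // immutable(dchar)[]

    assert(countSlashes(mutablePath) == 2);
    assert(countSlashes(narrow) == 2);
    assert(countSlashes(wide) == 2);
    assert(countSlashes(dwide) == 2);
}
```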
Mar 03 2011
parent Bekenn <leaveme alone.com> writes:
On 3/3/2011 10:17 PM, Jonathan M Davis wrote:
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?
That's still what it means. scope in this context is _not_ deprecated.
Oh, hey, I didn't know that. Even better. Thanks!
Mar 03 2011
prev sibling next sibling parent Bekenn <leaveme alone.com> writes:
On 3/3/2011 7:23 PM, Graham St Jack wrote:
 Ok, I don't mind supporting wchar and dchar in addition to char,
 especially if Windows insists on using them.

 My main issue here is with the constness of the parameters. I think the
 correct parameter to pass is const C[]. This has the advantages of:
 * Accepting both mutable and immutable data.
 * Declares that the function won't mutate the data.
 * Declares that the function doesn't expect the data to be immutable.
Agreed; I think I might modify that slightly to "in" instead of "const", but it means the exact same thing.
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?
"in" is a synonym for a non-parenthesized const.
Mar 03 2011
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable.
The problem is that the functions return slices of their input argument, which means that the constancy of the input argument gets transferred to the return value. Here's an example to illustrate:

    C[] first(C)(const C[] s) { return s[0 .. 1]; }

    char[] a = "hello".dup;
    auto b = first(a);

Try to compile this, and you get the error message

    Error: cannot implicitly convert expression (s[0u..1u]) of type const(char[]) to char[]

The correct thing to do is to use inout, like this:

    inout(C)[] basename(C)(inout(C)[] path) { ... }

I templated the functions on character type rather than string type exactly because I plan to do this. Unfortunately, it's not possible at the moment, because inout isn't properly implemented yet:

    http://d.puremagic.com/issues/show_bug.cgi?id=3748

Note that the functions which do not return slices of their input, such as setExtension(), joinPath(), etc., all have input parameters that are properly marked with 'in'.
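For reference, here is a minimal sketch of the inout approach described above. It assumes a compiler in which inout works for templated functions, which was not the case when this was written (see the bug linked above). `first` is the illustrative helper from the example, not part of the proposed module.

```d
import std.stdio : writeln;

// Illustrative helper from the example above, rewritten with inout.
// The qualifier of the argument (mutable, const or immutable) is
// carried over to the returned slice.
inout(C)[] first(C)(inout(C)[] s)
{
    return s[0 .. 1];
}

void main()
{
    char[] a = "hello".dup;
    char[] b = first(a);          // mutable in, mutable out
    b[0] = 'j';                   // b is a slice of a, so a changes too

    string c = first("world");    // immutable in, immutable out

    const(wchar)[] w = "wide"w;
    const(wchar)[] d = first(w);  // const in, const out, wchar this time

    writeln(a, " ", c, " ", d);
}
```

With `const C[]` as the parameter type, the first assignment would not compile, and that is the whole point of using inout for functions that return slices of their input.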
 It would be even better to use const scope char[], declaring that a
 reference won't be kept, but it seems that scope in this context is
 deprecated.
 
 Once upon a time "in" meant const scope. Does anyone know what it means
 now?
Still does. -Lars
Mar 04 2011
parent reply spir <denis.spir gmail.com> writes:
On 03/04/2011 09:15 AM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable.
The problem is that the functions return slices of their input argument, which means that the constancy of the input argument gets transferred to the return value. Here's an example to illustrate:

    C[] first(C)(const C[] s) { return s[0 .. 1]; }

    char[] a = "hello".dup;
    auto b = first(a);

Try to compile this, and you get the error message

    Error: cannot implicitly convert expression (s[0u..1u]) of type const(char[]) to char[]
IIUC, this means const should never be used on input parameters. Instead of meaning what the func will (not) do with its param(s), it imposes undue requirements on the outside world. Or do I miss something? From my point of view, qualifiers inside a function's interface should only describe the function behaviour. Denis -- _________________ vita es estrany spir.wikidot.com
Mar 04 2011
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 04 Mar 2011 09:33:04 +0100, spir wrote:

 On 03/04/2011 09:15 AM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 13:53:33 +1030, Graham St Jack wrote:

 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable.
The problem is that the functions return slices of their input argument, which means that the constancy of the input argument gets transferred to the return value. Here's an example to illustrate:

    C[] first(C)(const C[] s) { return s[0 .. 1]; }

    char[] a = "hello".dup;
    auto b = first(a);

Try to compile this, and you get the error message

    Error: cannot implicitly convert expression (s[0u..1u]) of type const(char[]) to char[]
IIUC, this means const should never be used on input parameters. Instead of meaning what the func will (not) do with its param(s), it imposes undue requirements on the outside world. Or do I miss something? From my point of view, qualifiers inside a function's interface should only describe the function behaviour.
It should not be used if the function's return value is an alias of an input parameter. That's what inout is for. In all other cases, const is fine. http://www.digitalmars.com/d/2.0/function.html#inout-functions -Lars
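To illustrate the "all other cases" part, here is a small sketch of a function whose result is a freshly allocated string, so 'in' parameters put no requirement on the caller. `withExtension` is a made-up name for this example and is not the proposed module's setExtension().

```d
import std.stdio : writeln;

// Hypothetical helper, not part of the proposed module.
// It builds a new string, so the result never aliases the arguments
// and the parameters can safely be marked 'in'.
string withExtension(in char[] path, in char[] ext)
{
    return (path ~ "." ~ ext).idup;
}

void main()
{
    char[] mutablePath = "report".dup;
    string immutablePath = "notes";

    // Both mutable and immutable arguments are accepted.
    writeln(withExtension(mutablePath, "doc"));
    writeln(withExtension(immutablePath, "txt"));
}
```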
Mar 04 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 03/04/2011 04:23 AM, Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable. It would be even better to use const scope char[], declaring that a reference won't be kept, but it seems that scope in this context is deprecated. Once upon a time "in" meant const scope. Does anyone know what it means now?
AFAIK 'in' is not only still const scope, it also means precisely what your param is: plain input. (I would love params to be 'in' by default.) Denis -- _________________ vita es estrany spir.wikidot.com
Mar 04 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 03/04/2011 07:17 AM, Jonathan M Davis wrote:
 On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of: * Accepting both mutable and immutable data. * Declares that the function won't mutate the data. * Declares that the function doesn't expect the data to be immutable. It would be even better to use const scope char[], declaring that a reference won't be kept, but it seems that scope in this context is deprecated. Once upon a time "in" meant const scope. Does anyone know what it means now?
That's still what it means. scope in this context is _not_ deprecated. Only scoped local variables (not scoped parameters or scope statements) are deprecated. in would be the correct thing to use. It's used elsewhere with strings in Phobos. And yes, as long as the strings being passed in are not being mutated, then having the parameters be in is the correct thing to do.
What about 'in' as default? I think a function changing its params is a special case --and somewhat unsafe-- which should be clearly indicated at the interface level.

    void decode (S,T) (S source, mutable T target) {...}
    unchanged ---------------------^

Denis -- _________________ vita es estrany spir.wikidot.com
Mar 04 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
Looks very good. Including doc. A real pleasure to explore :-) Jonathan: "I'd prefer dirName to directory." Agreed. (The element in question is a name, not a piece of data modelling directory.) [that's ~ all what I would criticize ;-)] Denis -- _________________ vita es estrany spir.wikidot.com
Mar 03 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 04 Mar 2011 00:42:53 +0100, spir wrote:

 On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
Looks very good. Including doc. A real pleasure to explore :-)
Thanks!
 Jonathan: "I'd prefer dirName to directory." Agreed. (The element in
 question is a name, not a piece of data modelling directory.)
 [that's ~ all what I would criticize ;-)]
One more vote for dirName() has been noted. :) -Lars
Mar 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)
Meh. Since we have basename which is a replica of the homonym Unix command, I think dirname (with that exact spelling) would be most appropriate there. Andrei
Mar 04 2011
parent reply David Nadlinger <see klickverbot.at> writes:
On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)
Meh. Since we have basename which is a replica of the homonym Unix command, I think dirname (with that exact spelling) would be most appropriate there.
I must admit that I don't quite remember the results of the previous naming convention discussion regarding function names »imported« from other languages/systems, but my position hasn't changed since then: Just go with the D naming convention to make the name easy to remember/guess for _D programmers_, keeping in mind that DMD has a spell-checking feature to assist people new to D. I'd also prefer dirName for clarity, by the way, but I don't think longer names are worse generally. David
Mar 04 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 04 March 2011 07:42:38 David Nadlinger wrote:
 On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)
Meh. Since we have basename which is a replica of the homonym Unix command, I think dirname (with that exact spelling) would be most appropriate there.
 I must admit that I don't quite remember the results of the previous naming convention discussion regarding function names »imported« from
 other languages/systems, but my position hasn't changed since then: Just
 go with the D naming convention to make the name easy to remember/guess
 for _D programmers_, keeping in mind that DMD has a spell-checking
 feature to assist people new to D.

 I'd also prefer dirName for clarity, by the way, but I don't think
 longer names are worse generally.
The general consensus of the previous discussion was that we would stick to camelcase regardless of where the original name came from. People generally considered it harder to remember the names if you had to remember that they had screwy casing. - Jonathan M Davis
Mar 04 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 03/04/2011 04:42 PM, David Nadlinger wrote:
 On 3/4/11 4:10 PM, Andrei Alexandrescu wrote:
 On 3/4/11 2:35 AM, Lars T. Kyllingstad wrote:
 One more vote for dirName() has been noted. :)
Meh. Since we have basename which is a replica of the homonym Unix command, I think dirname (with that exact spelling) would be most appropriate there.
I must admit that I don't quite remember the results of the previous naming convention discussion regarding function names »imported« from other languages/systems, but my position hasn't changed since then: Just go with the D naming convention to make the name easy to remember/guess for _D programmers_, keeping in mind that DMD has a spell-checking feature to assist people new to D.
Yes; if we keep on adopting names that don't follow D's convention under the pretext they exist somewhere, then let's just throw the convention to the garbage, stop talking about such topics, and let everyone (first lib authors) use whatever they find nice. "patchwork lexicon" Denis -- _________________ vita es estrany spir.wikidot.com
Mar 04 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 03/03/2011 05:29 PM, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
Looks very good. Including doc. A real pleasure to explore :-) Jonathan: "I'd prefer dirName to directory." Agreed. (The element in question is a name, not a piece of data modelling directory.) [that's ~ all what I would criticize ;-)] Denis -- _________________ vita es estrany spir.wikidot.com
Mar 03 2011
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I'm certainly all in favor of this (being the one that started the "std.path.getName(): Screwy by design?" thread in the first place ;) ). It's a huge improvement over the current std.path.

My (updated) comments:

Names:

- Given the choice between "*Separator" and "*Sep", I would lean towards "*Sep". But I'd be perfectly happy either way since I'll probably never use either of them (Like I said in the other thread, I'd rather just use forward-slash everywhere and convert to backslashes (probably via toCanonical) as-needed. So much less messy that way.)

- Regarding any other abbreviation like "extension" vs "ext" or "directory" vs "dir": I like them either way. It's all good.

- Need to change "basename" to "baseName". I'd actually be happier with "fileNameOf", but "baseName" is fine, too. No objection, as long as it's camel-cased.

- fcmp, fncharmatch and fnmatch are pretty awful names. If I just look at "fcmp" I can't tell what the hell the "f" means. Compare floating-point numbers? Doesn't remotely scream "compare file names". The other two are worse: "fn" keeps telling my brain "function" every time I look at it even though I know full well I'm looking at std.path. I don't really care much about the exact final name, but as examples, anything along these lines would be good:

    fcmp -> cmpFileName
    fnmatch -> matchPath, matchGlob or maybe matchFileName
    fncharmatch -> matchPathChar, or maybe matchFileNameChar or matchFileChar

- Regarding "directory", I disagree with Jonathan that it sounds like it checks if it's a valid directory. I would *strongly expect* a function like that to be named "isDirectory" or "isDir". I'm happy with the name "directory" and prefer it to his suggestion of "dirName", *but* I would not object to "dirName" (as long as "drive" is changed to "driveName" for consistency). Something like "directoryOf" or "dirNameOf" might be even better still. I'd be happy with any of the above though.

Functionality:

- toCanonical needs to expand the tilde "~" path. If it already does, the docs should mention this.

The tilde needs to be thought out more:

- Is "~/foo" a relative path or absolute? Either way, it should be documented. If it's relative, then toAbsolute needs to expand it (and be documented as such).

- Maybe it should be "expandHomeDir" or "expandHome" instead of "expandTilde"?

- Windows *does* have a concept of a home dir, so maybe tilde should be expanded even on Windows. Only problem though is that Windows has *two* main home dirs for each user: %HOMEPATH% for user-created files and %APPDATA% for application data. (And some others, but I don't think any of the others are appropriate for "~") So maybe there should be these three:

    1. expandTilde: Exactly as it is now: expands ~ on posix, no-op on windows.

    2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the user's home directory. On windows: Expands "~" and "%HOMEDIR%" to whatever %HOMEDIR% is set to.

    3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to the user's home directory. On windows: Expands "~" and "%APPDATA%" to whatever %APPDATA% is set to.

- Speaking of %HOMEDIR% and %APPDIR%, there should be a function that expands all of these: http://technet.microsoft.com/en-us/library/cc749104(WS.10).aspx Although I think those are all environment vars, so a function that just expands env vars might be good enough. Either way, it should definitely be called by toCanonical (and documented as such). Special thought would have to be given to how to handle this cross-platform. Maybe any ones with obvious equivalents on posix (like %HOMEDIR%, %APPDIR%, and %TEMP%) should be converted appropriately on posix. Maybe posix uses a different delimiter, and if so, how to handle each delimiter on each platform should be thought out.

That's all I can think of right now.
Of course if your proposed module becomes the new std.path just as it is and the above improvements wait for another update, I'd still be happy. Heck, again, your module is a *huge* improvement over the current std.path even just as it is.
Mar 03 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 03 Mar 2011 22:51:01 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I'm certainly all in favor of this (being the one that started the "std.path.getName(): Screwy by design?" thread in the first place ;) ). It's a huge improvement over the current std.path.
Thanks!
 My (updated) comments:
 
 Names:
 - Given the choice between "*Separator" and "*Sep", I would lean towards
 "*Sep". But I'd be perfectly happy either way since I'll probably never
 use either of them (Like I said in the other thread, I'd rather just use
 forward-slash everywhere and convert to backslashes (probably via
 toCanonical) as-needed. So much less messy that way.)
 
 - Regarding any other abbreviation like "extension" vs "ext" or
 "directory" vs "dir": I like them either way. It's all good.
 
 - Need to change "basename" to "baseName". I'd actually be happier with
 "fileNameOf", but "baseName" is fine, too. No objection, as long as it's
 camel-cased.
See my comments to Jonathan's post.
 - fcmp, fncharmatch and fnmatch are pretty awful names. If I just look
 at "fcmp" I can't tell what the hell the "f" means. Compare
 floating-point numbers? Doesn't remotely scream "compare file names". 
 The other two are worse: "fn" keeps telling my brain "function" every
 time I look at it even though I know full well I'm looking at std.path.
 I don't really care much about the exact final name, but as examples,
 anything along these lines would be good:
     fcmp -> cmpFileName
     fnmatch -> matchPath, matchGlob or maybe matchFileName
     fncharmatch -> matchPathChar, or maybe matchFileNameChar or matchFileChar
I agree the names are terrible, and I like your suggestions. I suggest we go with cmpPath, matchPath and matchPathChar.
 - Regarding "directory", I disagree with Jonathan that it sounds like it
 checks if it's a valid directory. I would *strongly expect* a function
 like that to be named "isDirectory" or "isDir". I'm happy with the name
 "directory" and prefer it to his suggestion of "dirName", *but* I would
 not object to "dirName" (as long as "drive" is changed to "driveName"
 for consistency). Something like "directoryOf" or "dirNameOf" might be
 even better still. I'd be happy with any of the above though.
I prefer "directory" over "dirName" too, but I can live with the latter. You'll have a hard time convincing me to use "directoryOf" or "dirNameOf", though. ;)
 Functionality:
 - toCanonical needs to expand the tilde "~" path. If it already does,
 the docs should mention this.
I hadn't thought of this, good thing you brought it up. One thing which should perhaps be taken into consideration is that toCanonical() as it is now is a rather simple in-memory string operation. expandTilde(), on the other hand, does disk I/O in some cases (it does a /etc/passwd lookup). I'll have to think some more about this.
 The tilde needs to be thought out more: - Is "~/foo" a relative path or
 absolute? Either way, it should be documented. If it's relative, then
 toAbsolute needs to expand it (and be documented as such).
Good point. It's definitely an absolute path, so I'll need to include this case in isAbsolute(). It is of course technically possible to set $HOME to a relative path, but it is quite pointless and not something we should worry about.
 - Maybe it should be "expandHomeDir" or "expandHome" instead of
 "expandTilde"?
 
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:
 
         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
         on
 windows.
 
         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.
 
         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
         the
 user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.
On POSIX you expect to be able to use ~ anywhere you're asked to input a path/filename. Is this the case on Windows? Can you write %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?
 - Speaking of %HOMEDIR% and %APPDIR%, there should be a function that
 expands all of these:
 http://technet.microsoft.com/en-us/library/cc749104(WS.10).aspx  
 Although I think those are all environment vars, so a function that just
 expands env vars might be good enough. Either way, it should definitely
 be called by toCanonical (and documented as such). Special thought would
 have to be given to how to handle this cross-platform. Maybe any ones
 with obvious equivalents on posix (like %HOMEDIR%, %APPDIR%, and %TEMP%)
 should be converted appropriately on posix. Maybe posix uses a different
 delimiter, and if so, how to handle each delimiter on each platform
 should be thought out.
I agree a function that expands environment variables could be useful, but I don't think std.path is the right place for it. Perhaps an "expand" member function of std.process.environment?
 That's all I can think of right now.
Thanks for the feedback!
 Of course if your proposed module becomes the new std.path just as it is
 and the above improvements wait for another update, I'd still be happy.
 Heck, again, your module is a *huge* improvement over the current
 std.path even just as it is.
:) -Lars
Mar 04 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
         on
 windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
         the
 user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.
On POSIX you expect to be able to use ~ anywhere you're asked to input a path/filename. Is this the case on Windows? Can you write %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?
No, it's just an environment variable. In fact, it seems that % is a valid filename character (I wouldn't have even guessed that), so expanding any of the %BLAH% stuff in std.path is probably a bad idea after all.

The expandTilde/expandHomeDir/expandAppDataDir working on *just* tilde might be a good idea though. Although maybe it would be better to just have expandTilde and then add these two functions instead:

- getHomeDir(): Posix: Returns expanded form of "~". Windows: Returns expanded form of "%HOMEDIR%"
- getAppDataDir(): Posix: Returns expanded form of "~". Windows: Returns expanded form of "%APPDATA%"
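
The two getters proposed above could look roughly like the following. This is only a sketch under stated assumptions: getHomeDir and getAppDataDir are the hypothetical names from this post, not existing library functions, and on Windows the home directory is read from %HOMEDRIVE%%HOMEPATH% (the spelling that turns out to be correct further down the thread) rather than from %HOMEDIR%. The only library call used is std.process.environment.get.

```d
import std.process : environment;

/// Hypothetical helper: the user's home directory, or null if it is unknown.
string getHomeDir()
{
    version (Windows)
    {
        // The home directory is split across two environment variables.
        auto drive = environment.get("HOMEDRIVE");
        auto path = environment.get("HOMEPATH");
        return (drive is null || path is null) ? null : drive ~ path;
    }
    else
    {
        // On POSIX, "~" conventionally expands to $HOME.
        return environment.get("HOME");
    }
}

/// Hypothetical helper: the per-user application data directory.
string getAppDataDir()
{
    version (Windows)
        return environment.get("APPDATA");
    else
        return getHomeDir(); // same as "~", as suggested in the post
}

void main()
{
    import std.stdio : writeln;

    writeln("home:     ", getHomeDir());
    writeln("app data: ", getAppDataDir());
}
```

Returning null for "unknown" keeps the sketch short; a real API might prefer to throw or to fall back to another lookup.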
Mar 04 2011
parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Fri, 04 Mar 2011 10:13:04 -0000, Nick Sabalausky <a a.a> wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
         on
 windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
         the
 user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.
On POSIX you expect to be able to use ~ anywhere you're asked to input a path/filename. Is this the case on Windows? Can you write %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?
No, it's just an environment variable.
Actually, you can. I just tried Textpad and Word 2010 and both accepted me typing:

%HOMEDRIVE%%HOMEPATH%\  (at this point they both bring up suggestions)
%APPDATA%\  (at this point they both bring up suggestions)

FYI.. my environment variables are:

APPDATA=C:\Users\rheath.<domain>\AppData\Roaming
HOMEDRIVE=C:
HOMEPATH=\Users\rheath.<domain>

I don't have HOMEDIR, .. this is on Windows 7 x64 BTW.
 In fact, it seems that % is a valid
 filename character (I wouldn't have even guessed that), so expanding any  
 of
 the %BLAH% stuff in std.path is probably a bad idea after all.
Not necessarily, but it might require a bit more double-checking, for example..

If you type the following at command prompt you get an error.

copy con test%HOMEDRIVE%.txt
"The filename, directory name, or volume label syntax is incorrect."

Because %HOMEDRIVE% is expanded to C: and testC:.txt is invalid. But these both work:

copy con test%HOMEDRIVE.txt  (missing 2nd %)
copy con test%HOMEDRIV%.txt  (nonexistent envvar)

In the latter case you actually get a file named "test%HOMEDRIV%.txt"; it hasn't attempted to replace the nonexistent envvar with a blank string, as that would result in "test.txt".

R

p.s. "copy con" means copy console, type something, then press ctrl+z to mark EOF.
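
The rule described above (expand %NAME% only when NAME is actually defined, and otherwise leave the text alone) can be sketched in D as follows. expandEnvVars is a hypothetical helper written for illustration, not a function from std.path or std.process, and it only approximates what the command prompt does. The DEMO_* variable names are made up for the demo.

```d
import std.process : environment;
import std.string : indexOf;

/// Hypothetical helper: expands %NAME% where NAME is a defined environment
/// variable and copies everything else through unchanged.
string expandEnvVars(string s)
{
    string result;
    size_t i = 0;
    while (i < s.length)
    {
        if (s[i] == '%')
        {
            // Look for the closing '%' of a candidate %NAME% token.
            auto len = s[i + 1 .. $].indexOf('%');
            if (len > 0)
            {
                auto value = environment.get(s[i + 1 .. i + 1 + len]);
                if (value !is null)
                {
                    result ~= value;
                    i += len + 2; // skip the name and both '%' delimiters
                    continue;
                }
            }
        }
        // Not part of a defined %NAME% token: keep the character as-is.
        result ~= s[i];
        ++i;
    }
    return result;
}

void main()
{
    import std.stdio : writeln;

    environment["DEMO_DRIVE"] = "C:";  // defined for the demo
    environment.remove("DEMO_DRIV");   // make sure this one is undefined

    writeln(expandEnvVars(`%DEMO_DRIVE%\report.doc`)); // defined variable
    writeln(expandEnvVars(`test%DEMO_DRIV%.txt`));     // undefined variable
    writeln(expandEnvVars(`test%DEMO_DRIVE.txt`));     // missing 2nd %
}
```

Whether such a function belongs in a path module at all is exactly the open question in this subthread; the sketch only shows that the "leave unknown names alone" behaviour is cheap to implement.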
Mar 04 2011
parent "Nick Sabalausky" <a a.a> writes:
"Regan Heath" <regan netmail.co.nz> wrote in message 
news:op.vrtj9iz454xghj puck.auriga.bhead.co.uk...
 On Fri, 04 Mar 2011 10:13:04 -0000, Nick Sabalausky <a a.a> wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikqabr$796$4 digitalmars.com...
 - Windows *does* have a concept of a home dir, so maybe tilde should be
 expanded even on Windows. Only problem though is that Windows has *two*
 main home dirs for each user: %HOMEPATH% for user-created files and
 %APPDATA% for application data. (And some others, but I don't think any
 of the others are appropriate for "~") So maybe there should be these
 three:

         1. expandTilde: Exactly as it is now: expands ~ on posix, no-op
         on
 windows.

         2. expandHomeDir: On posix: Expands "~" and "%HOMEDIR%" to the
 user's home directory. On windows: Expands "~" and "%HOMEDIR%" to
 whatever %HOMEDIR% is set to.

         3. expandAppDataDir: On posix: Expands "~" and "%APPDATA%" to
         the
 user's home directory. On windows: Expands "~" and "%APPDATA%" to
 whatever %APPDATA% is set to.
On POSIX you expect to be able to use ~ anywhere you're asked to input a path/filename. Is this the case on Windows? Can you write %HOMEDIR%\report.doc in Word's "Open" dialog, for instance?
No, it's just an environment variable.
Actually, you can. I just tried Textpad and Word 2010 and both accepted me typing:

%HOMEDRIVE%%HOMEPATH%\  (at this point they both bring up suggestions)
%APPDATA%\  (at this point they both bring up suggestions)

FYI.. my environment variables are:

APPDATA=C:\Users\rheath.<domain>\AppData\Roaming
HOMEDRIVE=C:
HOMEPATH=\Users\rheath.<domain>

I don't have HOMEDIR, .. this is on Windows 7 x64 BTW.
Oh, you're right. It's the same for me on XP. I must have misread the doc page: There is no %HOMEDIR%, that's why it didn't work when I tried it in notepad. The correct thing is %HOMEDRIVE%%HOMEPATH%. That works for me, and so does %APPDATA%.
 In fact, it seems that % is a valid
 filename character (I wouldn't have even guessed that), so expanding any 
 of
 the %BLAH% stuff in std.path is probably a bad idea after all.
Not necessarily, but it might require a bit more double-checking, for example..

If you type the following at command prompt you get an error.

copy con test%HOMEDRIVE%.txt
"The filename, directory name, or volume label syntax is incorrect."

Because %HOMEDRIVE% is expanded to C: and testC:.txt is invalid. But these both work:

copy con test%HOMEDRIVE.txt  (missing 2nd %)
copy con test%HOMEDRIV%.txt  (nonexistent envvar)

In the latter case you actually get a file named "test%HOMEDRIV%.txt"; it hasn't attempted to replace the nonexistent envvar with a blank string, as that would result in "test.txt".
FWIW, I just did a little test to see if the substitution is being done by the commandline itself or by the command being run. Seems to be the commandline itself doing the substitution. Make a little echo program in D:

    import std.stdio;

    void main(string[] args)
    {
        writeln(args[1]);
    }

Then compile and run it:

    dmd main.d
    main %APPDATA%
    C:\Documents and Settings\Nick Sabalausky\Application Data

Next thing to test would be the file I/O API. I'm wondering if passing "%APPDATA%" directly to the file I/O routines would be taken literally or get automatically expanded. I would think it would be taken literally, but with all the magic that windows does, I'm not so sure. Don't have time to test it at the moment.
Mar 04 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
How about functions for getting common directories like the home and temp directory. -- /Jacob Carlborg
Mar 04 2011
parent reply J Chapman <j ch.com> writes:
== Quote from Jacob Carlborg (doob me.com)'s article
 On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
How about functions for getting common directories like the home and temp directory.
They'd belong in a separate module - std.environment?
Mar 04 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-03-04 11:31, J Chapman wrote:
 == Quote from Jacob Carlborg (doob me.com)'s article
 On 2011-03-03 17:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

       http://kyllingen.net/code/ltk/doc/path.html
       https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
How about functions for getting common directories like the home and temp directory.
They'd belong in a separate module - std.environment?
Yeah, that might be a better idea. -- /Jacob Carlborg
Mar 04 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday 04 March 2011 00:25:31 spir wrote:
 On 03/04/2011 07:17 AM, Jonathan M Davis wrote:
 On Thursday 03 March 2011 19:23:33 Graham St Jack wrote:
 On 04/03/11 12:34, Bekenn wrote:
 On 3/3/11 3:30 PM, Graham St Jack wrote:
 My first instinct would be to use non-templated functions that take
 const
 char[].
Please don't ever restrict encodings like that. As much as possible, libraries should seek to be encoding agnostic (though I'm all for const-qualifying parameters). This is one area where I feel the standard library severely lacks at present. As a Windows developer, I prefer to use wchar strings by default and use only the W versions of the Windows API functions, because the A versions severely limit functionality. Only the W versions have full support for Unicode; the A versions are entirely dependent on the current (8-bit) code page. This means no support for UNC paths or paths longer than 260 characters, and also means that international characters commonly end up completely garbled. Good practice in Windows is to consider the A versions deprecated and avoid them like the plague.
Ok, I don't mind supporting wchar and dchar in addition to char, especially if Windows insists on using them. My main issue here is with the constness of the parameters. I think the correct parameter to pass is const C[]. This has the advantages of:

* Accepting both mutable and immutable data.
* Declares that the function won't mutate the data.
* Declares that the function doesn't expect the data to be immutable.

It would be even better to use const scope char[], declaring that a reference won't be kept, but it seems that scope in this context is deprecated. Once upon a time "in" meant const scope. Does anyone know what it means now?
That's still what it means. scope in this context is _not_ deprecated. Only scoped local variables (not scoped parameters or scope statements) are deprecated. in would be the correct thing to use. It's used elsewhere with strings in Phobos. And yes, as long as the strings being passed in are not being mutated, then having the parameters be in is the correct thing to do.
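
To make the point about `in` parameters concrete, here is a minimal sketch of a function templated on the character type that takes its argument as `in C[]`. countSeparators is a made-up example, not a function from the proposed module; the only library symbol used is std.traits.isSomeChar.

```d
import std.traits : isSomeChar;

/// Hypothetical example: counts the '/' characters in a path.
/// `in C[]` promises that the function will not modify the characters.
size_t countSeparators(C)(in C[] path)
    if (isSomeChar!C)
{
    size_t n = 0;
    foreach (c; path) // read-only access; `path[0] = 'x'` would not compile
        if (c == '/')
            ++n;
    return n;
}

void main()
{
    char[] buf = "a/b/c".dup;        // mutable
    const(char)[] view = "a/b";      // const
    string str = "a/b/c/d";          // immutable
    wstring wide = "a/b"w;           // UTF-16
    dstring dwide = "a/b/c"d;        // UTF-32

    // One template accepts every qualifier and character width.
    assert(countSeparators(buf) == 2);
    assert(countSeparators(view) == 1);
    assert(countSeparators(str) == 3);
    assert(countSeparators(wide) == 1);
    assert(countSeparators(dwide) == 2);
}
```

The same function accepts mutable, const and immutable arrays of char, wchar and dchar, which is the combination of encoding-agnosticism and const-correctness argued for in this subthread.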
What about 'in' as default? I think a function changing its params is a special case --and somewhat unsafe-- which should be clearly indicated at the interface level.

    void decode (S,T) (S source, mutable T target) {...}
    unchanged ---------------------^
That's not how D works. That's not how it will ever work. Every time such an idea has been brought up to Walter (and probably any of the major developers for that matter), it has been shot down. It's too big a departure from C, C++, Java, etc. And instead of marking stuff as const or in a decent chunk of the time, you're going to have to mark stuff as mutable a decent chunk of the time - possibly more. It would confuse most programmers to no benefit. You'd just be trading a common, known default for an uncommon, strange one (for C-based languages anyway) and changing which variables you had to mark with what. You're still going to have to mark plenty of variables as something other than the default. So, no. in will never be the default.

- Jonathan M Davis
Mar 04 2011
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Mar 04 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;)

The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer.

 * dirSeparator, dirSep, sep
 * currentDirSymbol, currentDirSym, curDirSymbol
 * basename, baseName, filename, fileName
 * dirname, dirName, directory, getDir, getDirName
 * drivename, driveName, drive, getDrive, getDriveName
 * extension, ext, getExt, getExtension
 * stripExtension, stripExt

(The same convention will be used for stripExtension, replaceExtension and defaultExtension.)

-Lars
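
For readers unsure what each of the functions being voted on actually returns, here is a small sketch. It uses the names found in present-day Phobos std.path as far as I know them (baseName, dirName, driveName, extension, stripExtension, setExtension, defaultExtension, dirSeparator); that choice is an assumption made for illustration and should not be read as the outcome of this vote.

```d
import std.path;

void main()
{
    // The input uses '/' so the same assertions hold on POSIX and Windows.
    enum p = "dir/file.ext";

    assert(baseName(p) == "file.ext");        // last path component
    assert(dirName(p) == "dir");              // everything before it
    assert(extension(p) == ".ext");           // includes the leading dot
    assert(stripExtension(p) == "dir/file");  // path without the extension

    // Replace an extension, or add one only if none is present.
    assert(setExtension("file.ext", "txt") == "file.txt");
    assert(defaultExtension("file", "txt") == "file.txt");
    assert(defaultExtension("file.ext", "txt") == "file.ext");

    version (Windows)
    {
        assert(driveName(`d:\file`) == "d:");
        assert(dirSeparator == `\`);
    }
    else
    {
        assert(driveName("/file") == "");     // no drives on POSIX
        assert(dirSeparator == "/");
    }
}
```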
Mar 05 2011
next sibling parent reply Bekenn <leaveme alone.com> writes:
dirSeparator	-- I'd actually prefer pathSeparator, but that's not on the 
list.
currentDirSymbol
baseName
dirName
driveName
extension
stripExtension


Abbrvs impr rdblty.
Mar 05 2011
parent Jim <bitcirkel yahoo.com> writes:
Bekenn Wrote:

 dirSeparator	-- I'd actually prefer pathSeparator, but that's not on the 
 list.
 currentDirSymbol
 baseName
 dirName
 driveName
 extension
 stripExtension
++vote ...except that I like the current distinction between pathSeparator and dirSeparator as it is. pathSeparator should divide paths not directories.
Mar 05 2011
prev sibling next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:iktojn$go0$1 digitalmars.com...
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep, But I'd be fine with the others too.
 * currentDirSymbol, currentDirSym, curDirSymbol
currDirSymbol, But I'd be fine with the others too.
 * basename, baseName, filename, fileName
baseName or baseFileName. Definitely not 'filename' because I frequently use that as a variable name. Definitely not 'basename' because it's not camel-cased, and because the fact that there's a unix command named 'basename' is completely irrelevant. Patchwork naming "convention" is idiotic. And I'm uncomfortable with fileName because despite it being much more descriptive than baseName, it's too close to what I'd use as a common variable name.
 * dirname, dirName, directory, getDir, getDirName
dirName or directory. But anything except 'dirname' is fine.
 * drivename, driveName, drive, getDrive, getDriveName
driveName or drive. But anything except 'drivename' is fine.
 * extension, ext, getExt, getExtension
ext. But the others are fine, too.
 * stripExtension, stripExt
stripExt, But either one is fine.

Well now everyone, I think that I would have to have to say to all of the people here in this newsgroup that excess verbosity can and does (and would continue to) harm readability every last bit as much as having 2 mny abbrs wuld harm the readability of the name of a variable, or a function or really any other custom-named identifier that may or may not exist in D, or in Phobos, or in any code written in D, or really any other language regardless if it happens to be a programming language or some other sort of a language such as a human language.
Mar 05 2011
parent reply spir <denis.spir gmail.com> writes:
On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol
currDirSymbol, But I'd be fine with the others too.
"currDirSymbol" not on the list ;-) Denis -- _________________ vita es estrany spir.wikidot.com
Mar 05 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2213.1299361218.4748.digitalmars-d puremagic.com...
 On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol
currDirSymbol, But I'd be fine with the others too.
"currDirSymbol" not on the list ;-)
I deliberately added it :) I think it's better than "curDirSymbol" (but like I said, I can go either way.)
Mar 05 2011
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Without even looking at any posts in this discussion, what is a
directory *symbol* anyway?
Mar 05 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 05 March 2011 17:22:01 Andrej Mitrovic wrote:
 Without even looking at any posts in this discussion, what is a
 directory *symbol* anyway?
currDirSym would be ".", and parentDirSym would be "..". It's what you use when navigating directories backwards. It's quite clear if you look at the docs. - Jonathan M Davis
Mar 05 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I dunno, maybe I'd prefer an enum.

enum path : string { current = ".", up = ".." }

void main()
{
    // join: a path-joining function such as std.path.join
    string newPath = join("C:", "Windows", "Subdir", path.up,
                          path.up, "Program Files");
    assert(newPath == r"C:\Windows\Subdir\..\..\Program Files");
}

This is just nitpicking however. And 'current' is only used on Linux afaik? :)

On 3/6/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 On Saturday 05 March 2011 17:22:01 Andrej Mitrovic wrote:
 Without even looking at any posts in this discussion, what is a
 directory *symbol* anyway?
currDirSym would be ".", and parentDirSym would be "..". It's what you use when navigating directories backwards. It's quite clear if you look at the docs. - Jonathan M Davis
Mar 05 2011
parent "Nick Sabalausky" <a a.a> writes:
"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.2230.1299375838.4748.digitalmars-d puremagic.com...
I dunno, maybe I'd prefer an enum.

 enum path : string { current = ".", up = ".." };

 main() { string newPath = join("C:", "Windows", "Subdir", path.up,
 path.up, "Program Files");
 newPath == r"C:\Windows\Subdir\..\..\Program Files";

 This is just nitpicking however. And 'current' is only used on Linux 
 afaik? :)
Windows has always had the '.' meaning "current directory". Even early versions of MS-DOS had it.
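
As an illustration of how "." and ".." behave inside a path, here is a short sketch. It assumes the buildPath and buildNormalizedPath functions from present-day Phobos std.path, which are not necessarily part of the module proposed in this thread.

```d
import std.path : buildNormalizedPath, buildPath;

void main()
{
    version (Posix)
    {
        // buildPath only joins segments; "." and ".." are kept verbatim.
        assert(buildPath("a", "..", "b") == "a/../b");

        // buildNormalizedPath resolves "." and ".." textually.
        assert(buildNormalizedPath("/foo/./bar/..//baz/") == "/foo/baz");
        assert(buildNormalizedPath("../foo/.") == "../foo");
    }
    version (Windows)
    {
        // The same symbols mean the same thing on Windows.
        assert(buildNormalizedPath(`c:\foo\.\bar/..\\baz\`) == `c:\foo\baz`);
    }
}
```

Normalization here is purely a string operation; it does not consult the file system, so symbolic links are not taken into account.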
Mar 05 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 03/06/2011 01:35 AM, Nick Sabalausky wrote:
 "spir"<denis.spir gmail.com>  wrote in message
 news:mailman.2213.1299361218.4748.digitalmars-d puremagic.com...
 On 03/05/2011 09:57 PM, Nick Sabalausky wrote:
 * currentDirSymbol, currentDirSym, curDirSymbol
currDirSymbol, But I'd be fine with the others too.
"currDirSymbol" not on the list ;-)
I deliberately added it :) I think it's better than "curDirSymbol" (but like I said, I can go either way.)
I agree with you and Jonathan about that point. Also find that 'dir' is enough, esp in context, because it can hardly be misinterpreted. And it's very used in programming (not only a pair of PLs) and in computer use in general. Thus, it's one rare case where I find abbr ok. Denis -- _________________ vita es estrany spir.wikidot.com
Mar 06 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 03/05/2011 05:32 PM, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad"<public kyllingen.NOSPAMnet>  wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep
   * currentDirSymbol, currentDirSym, curDirSymbol
currentDirSymbol
   * basename, baseName, filename, fileName
baseName, fileName
   * dirname, dirName, directory, getDir, getDirName
dirName, getDirName
   * drivename, driveName, drive, getDrive, getDriveName
driveName, getDriveName
   * extension, ext, getExt, getExtension
   * stripExtension, stripExt
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)
don't mind About "xyz" vs "xyzName": the point is what is denoted /is/ a name. It's not a programming object modelling a directory or a drive.
 -Lars
Denis -- _________________ vita es estrany spir.wikidot.com
Mar 05 2011
prev sibling next sibling parent reply J Chapman <johnch_atms hotmail.com> writes:
== Quote from Lars T. Kyllingstad (public kyllingen.NOSPAMnet)'s article
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSeparator
  * currentDirSymbol, currentDirSym, curDirSymbol
currentDirSymbol
  * basename, baseName, filename, fileName
baseName (but prefer getBaseName for consistency with below)
  * dirname, dirName, directory, getDir, getDirName
getDirName
  * drivename, driveName, drive, getDrive, getDriveName
getDriveName
  * extension, ext, getExt, getExtension
getExtension
  * stripExtension, stripExt
stripExtension (but prefer removeExtension)
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)
 -Lars
Mar 05 2011
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Please no repetitions in consonants, e.g. "curr". That's something
I'll keep screwing up when typing, and all I'll get back is "no
curDirName in main.d", or "symbol not found", etc..
Mar 05 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 05 March 2011 14:07:44 Andrej Mitrovic wrote:
 Please no repetitions in consonants, e.g. "curr". That's something
 I'll keep screwing up when typing, and all I'll get back is "no
 curDirName in main.d", or "symbol not found", etc..
LOL. Whereas I'd argue that there _should_ be a repetition in consonants if the word that's being abbreviated has a double consonant. Otherwise, it looks like it's spelled wrong, and _I_ for one would be constantly mis-typing it. However, regardless of which way it goes, I _would_ point out that you wouldn't get an error message as bad as you suggest. It should be asking you if you meant X (where X is whatever the correct spelling is) instead, unlike languages like C++ or Java normally do. - Jonathan M Davis
Mar 05 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/5/11, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 However, regardless of which way it goes, I _would_ point out that you
 wouldn't
 get an error message as bad as you suggest. It should be asking you if you
 meant
 X (where X is whatever the correct spelling is) instead, unlike languages
 like
 C++ or Java normally do.

 - Jonathan M Davis
Oh yeah, I forgot about DMD's semi-recent inclusion of "did you mean..?" error messages. They're actually quite useful, and I'd wish Optlink was the same; "Oh, did you mean to link in mylibrary, not my liblaly.lib?"
Mar 05 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep and pathSep. Having Separator in the name is unnecessarily long.
  * currentDirSymbol, currentDirSym, curDirSymbol
currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if abbreviating both current and symbol is too much). Shorter but still quite clear. I would _definitely_ use two r's when abbreviating current though, since current has two r's. I confess that it's a major pet peeve of mine when I see current abbreviated with one r. It feels like it's being spelled wrong, since current has two r's.
  * basename, baseName, filename, fileName
baseName
  * dirname, dirName, directory, getDir, getDirName
dirName
  * drivename, driveName, drive, getDrive, getDriveName
driveLetter would probably be better actually - though it _could_ be more than one letter if someone has an insane number of drives (it's usually referred to as a drive letter though). Barring that, drive would be fine (as long as it's a property).
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt
 
 (The same convention will be used for stripExtension, replaceExtension
 and defaultExtension.)
I'm a bit torn between extension and ext -I'd like ext but am afraid it's a bit too short for clarity. However, I _do_ think that all of the names which use Extension as a prefix should use Ext instead. It's much shorter and still quite clear. - Jonathan M Davis
Mar 05 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:

 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep and pathSep. Having Separator in the name is unnecessarily long.
  * currentDirSymbol, currentDirSym, curDirSymbol
currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if abbreviating both current and symbol is too much). Shorter but still quite clear. I would _definitely_ use two r's when abbreviating current though, since current has two r's. I confess that it's a major pet peeve of mine when I see current abbreviated with one r. It feels like it's being spelled wrong.
  * basename, baseName, filename, fileName
baseName
  * dirname, dirName, directory, getDir, getDirName
dirName
  * drivename, driveName, drive, getDrive, getDriveName
driveLetter would probably be better actually - though it _could_ be more than one letter if someone has an insane number of drives (it's usually referred to as a drive letter though). Barring that, drive would be fine (as long as it's a property).
Interestingly, it seems drive names are actually restricted to one letter. See the last paragraph of this section: http://en.wikipedia.org/wiki/Drive_letter#Common_assignments -Lars
Mar 06 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 04:11:35 Lars T. Kyllingstad wrote:
 On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:
 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep and pathSep. Having Separator in the name is unnecessarily long.
  * currentDirSymbol, currentDirSym, curDirSymbol
currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if abbreviating both current and symbol is too much). Shorter but still quite clear. I would _definitely_ use two r's when abbreviating current though, since current has two r's. I confess that it's a major pet peeve of mine when I see current abbreviated with one r. It feels like it's being spelled wrong.
  * basename, baseName, filename, fileName
baseName
  * dirname, dirName, directory, getDir, getDirName
dirName
  * drivename, driveName, drive, getDrive, getDriveName
driveLetter would probably be better actually - though it _could_ be more than one letter if someone has an insane number of drives (it's usually referred to as a drive letter though). Barring that, drive would be fine (as long as it's a property).
Interestingly, it seems drive names are actually restricted to one letter. See the last paragraph of this section: http://en.wikipedia.org/wiki/Drive_letter#Common_assignments
I could have sworn that I'd seen something which allowed you to assign two-letter names to drives instead of just one... Oh well, it's not like two-letter drive names would be common anyway. That just makes driveLetter seem like that much better a name though, especially since driveLetter is then unambiguously a Windows thing as opposed to some general HDD thing. - Jonathan M Davis
Mar 06 2011
prev sibling parent reply Bekenn <leaveme alone.com> writes:
On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:

 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

 -Lars
Correct. However, the rules change for UNC paths: http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx
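To make the difference concrete, here is a minimal sketch using Python's `ntpath` module, picked only because it applies Windows path rules on any platform. The helper name `drive_part` is hypothetical, and nothing here describes how the proposed std.path treats drives or UNC paths.

```python
import ntpath  # Windows path rules, importable on every platform


def drive_part(path):
    """Return the drive or UNC share prefix of a Windows-style path."""
    drive, _rest = ntpath.splitdrive(path)
    return drive


# A classic drive is one letter plus a colon.
print(drive_part(r"C:\dir\file.txt"))              # C:
# For a UNC path the "drive" is the whole \\server\share prefix.
print(drive_part(r"\\server\share\dir\file.txt"))  # \\server\share
# A relative path has no drive at all.
print(repr(drive_part(r"dir\file.txt")))           # ''
```

So a function called driveLetter would have nothing sensible to return for the UNC case, while a more general "drive" notion can cover both.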
Mar 06 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 18:46:15 Bekenn wrote:
 On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:
 
 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments
 
 -Lars
Correct. However, the rules change for UNC paths: http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx
Now, _that_ is a great link. There's lots of good information there. Thanks! - Jonathan M Davis
Mar 06 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Bekenn" <leaveme alone.com> wrote in message 
news:il1h39$19p5$2 digitalmars.com...
 On 3/6/2011 4:11 AM, Lars T. Kyllingstad wrote:
 Interestingly, it seems drive names are actually restricted to one
 letter.  See the last paragraph of this section:

 http://en.wikipedia.org/wiki/Drive_letter#Common_assignments

 -Lars
Correct. However, the rules change for UNC paths: http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx
Great link! I can't believe how much is in there that I never even had the slightest clue about. The '//?/' and '//./' are *completely* new to me, and I've been a windows guy since 3.11. I think these parts are particularly relevant to our discussion here:

--------------------------------------------------
Do not end a file or directory name with a space or a period. Although the underlying file system may support such names, the Windows shell and user interface does not. However, it is acceptable to specify a period as the first character of a name. For example, ".temp".
--------------------------------------------------

This implies three things:

1. The windows shell and UI are shitty
2. The windows filesystem *does* allow files that end in '.' just like unix, despite the windows shell and UI being too stupid to handle them right.
3. *Even on windows* something that starts with a dot is to be considered a filename, not a nameless file with an extension.

--------------------------------------------------
File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.
--------------------------------------------------

I.e., WinAPI automatically accepts *both* slashes and backslashes as the directory separator, although lower-level stuff may expect backslashes.

--------------------------------------------------
{almost everything else}
--------------------------------------------------

Implies:

1. The ANSI/ASCII APIs should just simply *never* be used.
2. Handling all paths properly on windows is a royal fucking PITA.
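For comparison, an existing path library that already follows these conventions is Python's `ntpath` (its Windows flavour of `os.path`, usable on any platform). The sketch below only illustrates the two conventions discussed here, the leading dot and the mixed separators. It is not a statement about what the proposed std.path does.

```python
import ntpath  # Windows path rules, importable on every platform

# A leading dot starts a file name, it does not start an extension.
dotfile = ntpath.splitext(".temp")
print(dotfile)    # ('.temp', '')

# An ordinary name splits at the last dot.
ordinary = ntpath.splitext("notes.txt")
print(ordinary)   # ('notes', '.txt')

# Both '/' and '\' are accepted as directory separators.
mixed = ntpath.basename("C:/dir\\file.txt")
print(mixed)      # file.txt

# Normalising rewrites forward slashes as backslashes.
normalised = ntpath.normpath("C:/dir/sub/file.txt")
print(normalised)  # C:\dir\sub\file.txt
```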
Mar 07 2011
parent Bekenn <leaveme alone.com> writes:
On 3/7/2011 2:30 PM, Nick Sabalausky wrote:
 --------------------------------------------------
 {almost everything else}
 --------------------------------------------------

 Implies:

 1. The ANSI/ASCII APIs should just simply *never* be used.
This right here is something that I think needs to be drilled into every potential Windows programmer out there. The underlying file system usually encodes file names in Unicode, which provides great flexibility. The ANSI versions of Windows API functions *cannot* handle that. It is therefore impossible to guarantee that you can handle a valid Windows file path using the ANSI version of a function. ANSI versions exist /for backwards compatibility only/. New functionality is often introduced without even providing an ANSI version of the function. Just simply do not use ANSI functions.
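A rough way to see the limitation without touching the Windows API at all: try to encode a file name in a single-byte code page. The sketch below assumes cp1252 as a typical Western "ANSI" code page, and the helper name `fits_in_code_page` is hypothetical. It only shows the encoding problem, not any actual API call.

```python
def fits_in_code_page(name, code_page="cp1252"):
    """Return True if every character of name exists in the code page."""
    try:
        name.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_code_page("report.txt"))   # True
print(fits_in_code_page("résumé.txt"))   # True, é is part of cp1252
print(fits_in_code_page("日本語.txt"))    # False, no CJK in cp1252

# UTF-16, which the wide-character functions use, can encode any name.
wide = "日本語.txt".encode("utf-16-le")
print(len(wide))                         # 14 bytes for 7 characters
```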
Mar 07 2011
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 05 Mar 2011 16:32:55 +0000, Lars T. Kyllingstad wrote:

 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;)

The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer.

  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt

(The same convention will be used for stripExtension, replaceExtension and defaultExtension.)
In summary, it seems currentDirSymbol, baseName, dirName and driveName are clear winners. Less clear, but still voted for by the majority, are extension and stripExtension. It is a tie between dirSep and dirSeparator.

Below are the votes I counted. And before you say "hey, I didn't know we could make suggestions of our own", or "why did that guy get several votes?", this was by no means a formal vote. It was just trying to get a feel for people's preferences. Before the module gets accepted into Phobos there will have to be a formal review process, so there is still a lot of opportunity to fight over naming. :)

dirSep: 3 (Nick Sabalausky, spir, Jonathan M. Davis)
dirSeparator: 3 (Bekenn, Jim, J Chapman)

currDirSym: 1 (Jonathan M. Davis)
currDirSymbol: 2 (Nick Sabalausky, Jonathan M. Davis)
path.current: 1 (Andrej Mitrovic)
currentDirSymbol: 4 (Bekenn, Jim, J Chapman, spir)

baseName: 6 (Nick Sabalausky, Bekenn, Jim, J Chapman, spir, Jonathan M. Davis)
baseFileName: 1 (Nick Sabalausky)
fileName: 1 (spir)
basename: 1 (Andrei Alexandrescu)

dirName: 6 (Nick Sabalausky, Bekenn, Jim, spir, Jonathan M. Davis, David Nadlinger)
directory: 1 (Nick Sabalausky)
getDirName: 2 (J Chapman, spir)
dirname: 1 (Andrei Alexandrescu)

driveName: 4 (Nick Sabalausky, Bekenn, Jim, spir)
drive: 2 (Nick Sabalausky, Jonathan M. Davis)
getDriveName: 2 (J Chapman, spir)
driveLetter: 1 (Jonathan M. Davis)

ext: 1 (Nick Sabalausky)
extension: 2 (Bekenn, Jim)
getExtension: 1 (J Chapman)

stripExt: 2 (Nick Sabalausky, Jonathan M. Davis)
stripExtension: 3 (Bekenn, Jim, J Chapman)
Mar 06 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/6/11 6:31 AM, Lars T. Kyllingstad wrote:
 In summary, it seems currentDirSymbol, baseName, dirName and driveName
 are clear winners.  Less clear, but still voted for by the majority, are
 extension and stripExtension.  It is a tie between dirSep and
 dirSeparator.

 Below are the votes I counted.  And before you say "hey, I didn't know we
 could make suggestions of our own", or "why did that guy get several
 votes?", this was by no means a formal vote.  It was just trying to get a
 feel for people's preferences.  Before the module gets accepted into
 Phobos there will have to be a formal review process, so there is still a
 lot of opportunity to fight over naming. :)
I think whatever you choose will not please everybody, so just choose something and stick with it. Regarding all the extension naming stuff, I suggest you go with the "suffix" nomenclature which is more general and applicable to all OSs.

Regarding semantics, consistently strip the trailing slash. It is unequivocally the best semantics (and incidentally or not it's what Unix's dirname and basename do). If rsync et al need it, they can always look for it in the initial parameter. The reality of the matter is that you will never be able to accommodate all use cases there are with maximum convenience.

You may want to prepare this for review after April 1st, when the review for std.parallelism ends. There is good signal in the exchange so far, but from here on this discussion could go on forever and shift focus away from std.parallelism.

Andrei
Mar 06 2011
next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 06 Mar 2011 09:29:27 -0600, Andrei Alexandrescu wrote:

 On 3/6/11 6:31 AM, Lars T. Kyllingstad wrote:
 In summary, it seems currentDirSymbol, baseName, dirName and driveName
 are clear winners.  Less clear, but still voted for by the majority,
 are extension and stripExtension.  It is a tie between dirSep and
 dirSeparator.

 Below are the votes I counted.  And before you say "hey, I didn't know
 we could make suggestions of our own", or "why did that guy get several
 votes?", this was by no means a formal vote.  It was just trying to get
 a feel for people's preferences.  Before the module gets accepted into
 Phobos there will have to be a formal review process, so there is still
 a lot of opportunity to fight over naming. :)
I think whatever you choose will not please everybody, so just choose something and stick with it. Regarding all the extension naming stuff, I suggest you go with the "suffix" nomenclature which is more general and applicable to all OSs.
I don't agree. A suffix can be anything, and we already have functions in std.algorithm, std.array and std.string to deal with the general case. Like it or not, filename extensions are still the main method for conveying file type information on Windows (and even to some extent on Linux and OSX). I think that's a good reason to include support for manipulating extensions in std.path.
 Regarding semantics, consistently strip the trailing slash. It is
 unequivocally the best semantics (and incidentally or not it's what
 Unix's dirname and basename do). If rsync et al need it, they can always
 look for it in the initial parameter. The reality of the matter is that
 you will never be able to accommodate all use cases there are with
 maximum convenience.
I agree, and that's how I've done it.
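For readers who want to see the two conventions side by side, here is a small sketch in Python, whose standard library happens to contain both: `pathlib` ignores a trailing slash the way Unix's dirname and basename do, while `posixpath.split` keeps it significant. The helper name `unix_style_split` is hypothetical, and the code illustrates the semantics only, not my implementation.

```python
import posixpath
from pathlib import PurePosixPath


def unix_style_split(path):
    """Split into (directory, name), ignoring any trailing slashes."""
    p = PurePosixPath(path)
    return str(p.parent), p.name


# Trailing slash stripped: the last component is still "lib".
print(unix_style_split("/usr/lib/"))   # ('/usr', 'lib')
print(unix_style_split("dir/sub/"))    # ('dir', 'sub')
print(unix_style_split("file"))        # ('.', 'file')

# Trailing slash kept significant: the name comes back empty.
kept = posixpath.split("/usr/lib/")
print(kept)                            # ('/usr/lib', '')
```

With the first convention, "dir/sub" and "dir/sub/" always give the same answer. A caller that cares about the slash can still inspect the original string.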
 You may want to prepare this for review after April 1st, when the review
 for std.parallelism ends. There is good signal in the exchange so far,
 but from here on this discussion could go on forever and shift focus
 away from std.parallelism.
Absolutely. This was only intended as informal discussion, and not as a start on the formal review. -Lars
Mar 06 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 07:29:27 Andrei Alexandrescu wrote:
 I think whatever you choose will not please everybody, so just choose
 something and stick with it. Regarding all the extension naming stuff, I
 suggest you go with the "suffix" nomenclature which is more general and
 applicable to all OSs.
I agree with Lars on this one. Everyone knows what an extension is. It's a universal concept even if it's not used as much on non-Windows OSes. There _are_ plenty of programs in *nix which use it internally (likely because it's a lot easier than dealing with mime type) even if they shouldn't. "suffix" instead of "extension" or "ext" would be a lot less clear to most people and add pretty much no benefit.
 You may want to prepare this for review after April 1st, when the review
 for std.parallelism ends. There is good signal in the exchange so far,
 but from here on this discussion could go on forever and shift focus
 away from std.parallelism.
I agree that we've probably gotten as much out of the discussion of std.path as we could reasonably get prior to a full review, so continuing a major discussion in this thread is likely unwarranted. However, are you indicating that we should never have more than one module in review at a time? I see some benefit in spreading them out. On the other hand, if we have multiple modules ready for review, it seems like we could be slowing down progress unnecessarily if we ruled that we could only ever have one module under review at a time. As for std.parallelism, I fear that it is the sort of module which is going to get close examination by a few people, while most others will ignore it, either because they don't really intend to use it or because they fear that it will be too complicated to look at and review (especially if they're not all that well-versed in threading). So, I'm not sure how much of an in-depth examination it's going to get by the group at large. Which reminds me, I still need to go check it out... - Jonathan M Davis
Mar 06 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit in
 spreading them out, on the other hand, if we have multiple modules ready for
 review, it seems like we could be slowing down progress unnecessarily if we
 ruled that we could only ever have one module under review at a time.
We should have only one review at a time. That way each review will be thorough. Boost does that, and I don't want to mess with success - particularly since the Boost community is larger too. Andrei
Mar 06 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 17:35:32 Andrei Alexandrescu wrote:
 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit
 in spreading them out, on the other hand, if we have multiple modules
 ready for review, it seems like we could be slowing down progress
 unnecessarily if we ruled that we could only ever have one module under
 review at a time.
We should have only one review at a time. That way each review will be thorough. Boost does that, and I don't want to mess with success - particularly since the Boost community is larger too.
In the general case, that seems like a good idea. I just don't want to get in a situation where we have several modules in the queue which are ready for review but have to wait a month or two, because another module is under review. In the case of std.path, that could mean that we'll have to wait nearly a month to get it in. That will likely push it back a whole release. So, I have mixed feelings on the matter. In principle, having only one module in review at a time is a good idea, but I fear that it will slow down our progress unnecessarily. Still, if that's what you want to do, we might as well go forward with it for now and review that decision if we end up with too many items on the back burner awaiting review. While there does appear to have been a bit of an uptick on possible modules for review of late, we haven't exactly had tons of them being put forth for review yet either. - Jonathan M Davis
Mar 06 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/6/11 8:03 PM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 17:35:32 Andrei Alexandrescu wrote:
 However, are you indicating that we should
 never have more than one module in review at a time? I see some benefit
 in spreading them out, on the other hand, if we have multiple modules
 ready for review, it seems like we could be slowing down progress
 unnecessarily if we ruled that we could only ever have one module under
 review at a time.
We should have only one review at a time. That way each review will be thorough. Boost does that, and I don't want to mess with success - particularly since the Boost community is larger too.
In the general case, that seems like a good idea. I just don't want to get in a situation where we have several modules in the queue which are ready for review but have to wait a month or two, because another module is under review. In the case of std.path, that could mean that we'll have to wait nearly a month to get it in. That will likely push it back a whole release. So, I have mixed feelings on the matter. In principle, having only one module in review at a time is a good idea, but I fear that it will slow down our progress unnecessarily. Still, if that's what you want to do, we might as well go forward with it for now and review that decision if we end up with too many items on the back burner awaiting review. While there does appear to have been a bit of an uptick on possible modules for review of late, we haven't exactly had tons of them being put forth for review yet either. - Jonathan M Davis
Yah, thing is people work on stuff they care about, not the most urgent stuff - surprise! :o) As such we don't have a ton of proposals for networking and xml, but we do have one (and I won't argue it's a bad one) for rehashing a module that basically worked. Andrei
Mar 06 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
And it doesn't help that the people who may need a particular module aren't necessarily the same folks with the time and know-how to actually implement it... In any case, I think that it's safe to say that we can go forward with a "one review at a time" policy for now and revisit it if it ever becomes a problem. While I don't like the fact that std.path will be delayed, the occasional delay of a single module likely isn't a big deal. If we actually start getting enough modules proposed for review that a bit of a queue builds up, _then_ it could be a problem. But until that happens, there isn't really much sense in worrying about it. I _was_ thinking of putting forward a new proposal which includes the unit testing functionality that assertPred had which won't end up in an improved assert, so having to wait for both std.parallelism and std.path to be fully reviewed is a bit annoying, but it's not exactly urgent. It can wait if it has to. But both that and std.path may be able to have shorter review cycles than more complex proposals, simply because they're not as complex. Stuff like std.parallelism needs a thorough review. Stuff like std.path needs to be well-reviewed, but it doesn't really need as thorough a review, since it's much simpler functionality. So, if we end up with several smaller items for review, we may be able to move through those faster than several large ones anyway, and large ones are likely to be rarer simply due to the amount of work involved. In any case, we can go forward as you suggest with the "one review at a time" policy and work with that if and until it becomes a problem. - Jonathan M Davis
Mar 06 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
I'm not sure I'd say the current std.path "basically works", but I get what you mean.
 I _was_ thinking of putting forward a new proposal which includes the unit
 testing functionality that assertPred had which won't end up in an 
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
Mar 06 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 
 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
I'm not sure I'd say the current std.path "basically works", but I get what you mean.
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it: http://d.puremagic.com/issues/show_bug.cgi?id=5547 I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though. - Jonathan M Davis
Mar 06 2011
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-03-07 01:20:25 -0500, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 Speaking of which: Now that assertPred has been rejected on the grounds of
 an improved assert that doesn't yet exist, what is the current status of
 the improved assert?
There's an enhancement request for it: http://d.puremagic.com/issues/show_bug.cgi?id=5547 I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
I gave it a try even before assertPred was rejected to check feasibility, made something in a few hours that should have mostly worked, but then realized I'd been playing with the wrong assert code. There are apparently two code paths for asserts in DMD, one of which I'm not sure is used at all, and I took the wrong one to modify. I'll have to sort this out and possibly redo all this with the other code path (which seems a little more complicated because it relies on a per-module generated assert handler for some reason), but this'll have to wait until I have more time. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Mar 07 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it: http://d.puremagic.com/issues/show_bug.cgi?id=5547 I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
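To show what I mean by saving failures instead of throwing, here is a rough sketch. It is written in Python purely for illustration, and the names `Checks` and `check_pred` are made up for this example. It is neither assertPred nor the code in SemiTwist D Tools, just the general idea of a predicate check that reports both operands and keeps going.

```python
import operator


class Checks:
    """Collects failed predicate checks instead of raising on the first."""

    def __init__(self):
        self.failures = []

    def check_pred(self, pred, lhs, rhs):
        # Record both operands so the report shows the values that failed.
        if not pred(lhs, rhs):
            self.failures.append(f"{pred.__name__}({lhs!r}, {rhs!r}) failed")
            return False
        return True


checks = Checks()
checks.check_pred(operator.eq, 2 + 2, 4)      # passes, nothing recorded
checks.check_pred(operator.lt, 5, 3)          # fails, recorded
checks.check_pred(operator.eq, "abc", "abd")  # fails, recorded

# All failures are reported together at the end of the run.
for line in checks.failures:
    print(line)
```

A plain assert would have stopped at the first failing check and hidden the second one.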
Mar 07 2011
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...
 
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...
 
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in
 an improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it: http://d.puremagic.com/issues/show_bug.cgi?id=5547 I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can use it for whatever Boost lets you do with it, and even if what you're doing isn't Boost compatible, it's fine with me if you use it anyway.

I do intend to take some of its functionality which assert will never have (such as assertPred!("opCmp", "<") or assertPred!"opAssign") and make another proposal to add those, but that's going to have to wait until other stuff is reviewed, and it doesn't help with what assert is supposed to be doing anyway (such as assert(a == b)).

I would really have liked to get assertPred into Phobos, fancy assert or no, but too many people just wanted assert to be better and thought that assertPred was unnecessary, overcomplicated, and/or overkill.

- Jonathan M Davis
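For readers who have not seen assertPred, the following is a minimal sketch of the general idea, assuming nothing about its real implementation: a template that takes the operator as a compile-time string, so that a failure message can show both operand values. The name assertOp and its message format are hypothetical and are not part of assertPred or Phobos.

```d
import core.exception : AssertError;
import std.conv : text;

// Hypothetical helper, NOT the real assertPred: compares lhs and rhs with
// the operator given at compile time and reports both values on failure.
void assertOp(string op, L, R)(L lhs, R rhs,
                               string file = __FILE__, size_t line = __LINE__)
{
    // The operator string is spliced into an expression at compile time.
    immutable bool ok = mixin("lhs " ~ op ~ " rhs");
    if (!ok)
        throw new AssertError(
            text("assertOp!\"", op, "\" failed: ", lhs, " ", op, " ", rhs),
            file, line);
}

unittest
{
    assertOp!"<"(1, 2);             // passes silently

    bool caught = false;
    try
    {
        assertOp!"=="(2, 3);        // fails and reports both operands
    }
    catch (AssertError e)
    {
        caught = true;
        assert(e.msg == `assertOp!"==" failed: 2 == 3`);
    }
    assert(caught);
}
```

A plain assert(a == b) can only report the file and line, which is the gap both assertPred and the requested assert improvement were meant to close.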
Mar 07 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 I _was_ thinking of putting forward a new proposal which includes 
 the
 unit testing functionality that assertPred had which won't end up in
 an improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it:

    http://d.puremagic.com/issues/show_bug.cgi?id=5547

I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can use it for whatever Boost lets you do with it, and even if what you're doing isn't Boost compatible, it's fine with me if you use it anyway.
Thanks.
 I do intend to take some of its functionality which assert will never have 
 (such
 as assertPred!("opCmp", "<") or assertPred!"opAssign") and make another 
 proposal
 to add those, but that's going to have to wait until other stuff is 
 reviewed, and
 it doesn't help with what assert is supposed to be doing anyway (such as
 assert(a == b)).

 I would really have liked to get assertPred into Phobos, fancy assert 
 or no,
 but too many people just wanted assert to be better and thought that 
 assertPred
 was unnecessary, overcomplicated, and/or overkill.
Yea. I have a little bit of experience with JUnit/NUnit. Compared to that, assertPred is trivial and perfectly straightforward.
Mar 07 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can use it for whatever Boost lets you do with it, and even if what you're doing isn't Boost compatible, it's fine with me if you use it anyway.
Thanks.
I've added it and made an optional 'autoThrow' flag that, if set to false, prevents a failure from immediately bailing out of the whole unittest (some people like that, like me, and others don't). http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193
Mar 08 2011
parent reply spir <denis.spir gmail.com> writes:
On 03/08/2011 09:25 AM, Nick Sabalausky wrote:
 "Nick Sabalausky"<a a.a>  wrote in message
 news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can use it for whatever Boost lets you do with it, and even if what you're doing isn't Boost compatible, it's fine with me if you use it anyway.
Thanks.
I've added it and made an optional 'autoThrow' flag that, if set to false, prevents a failure from immediately bailing out of the whole unittest (some people like that, like me, and others don't). http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193
I like it as well. Denis -- _________________ vita es estrany spir.wikidot.com
Mar 08 2011
parent "Nick Sabalausky" <a a.a> writes:
"spir" <denis.spir gmail.com> wrote in message 
news:mailman.2341.1299588465.4748.digitalmars-d puremagic.com...
 On 03/08/2011 09:25 AM, Nick Sabalausky wrote:
 "Nick Sabalausky"<a a.a>  wrote in message
 news:il3tra$3gg$1 digitalmars.com...
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2328.1299539399.4748.digitalmars-d puremagic.com...
 On Monday, March 07, 2011 12:43:00 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2297.1299478837.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
Yea, that's what I figured, and that's why I was strongly in favor of assertPred despite the "promise" of assert improvements. You're the sole author of assertPred, right? Do you mind if I include it in my zlib/libpng-licensed SemiTwist D Tools library ( http://www.dsource.org/projects/semitwist ) ? I already have an assert-alternative in there, but assertPred is vastly superior. (Although, my assert-alternative does save a list of failures instead of immediately throwing, which I personally find to be essential for unittests, so I would probably add the *optional* ability to have assertPred do the same.)
Yes. I'm the sole author. Feel free to re-use it. It's under Boost, so you can use it for whatever Boost lets you do with it, and even if what you're doing isn't Boost compatible, it's fine with me if you use it anyway.
Thanks.
I've added it and made an optional 'autoThrow' flag that, if set to false, prevents a failure from immediately bailing out of the whole unittest (some people like that, like me, and others don't). http://www.dsource.org/projects/semitwist/changeset?new=%2F%40196&old=%2F%40193
I like it as well.
If you do use it, and have autoThrow set to false, be aware that it doesn't *yet* catch exceptions that are thrown from the actual code being tested. Ie:

unittest
{
    autoThrow = true; // Ie, the default (unless you use the unittestSection mixin)

    // A: AssertError is thrown, not caught and unittest bails out
    assertPred!"a"(false);

    // B: Exception is thrown, not caught and unittest bails out
    assertPred!"throw new Exception()"(10);

    autoThrow = false;

    // C: Error message is displayed, assertCount is incremented, unittest continues
    assertPred!"a"(false);

    // D: *Should* do same as C, but currently does same as B
    assertPred!"throw new Exception()"(10);
}

void main()
{
    // If autoThrow is false and there were any failures,
    // then this throws an actual AssertError
    flushAsserts();

    // Rest of main here
}

I plan to fix that though.
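One way the behaviour wanted for case D could be obtained is sketched below. This is only an illustration under stated assumptions, not the SemiTwist code: check, failures and flushChecks are hypothetical names, and the trick is to take the tested expression as a lazy parameter so that it is evaluated inside the helper's own try block.

```d
import core.exception : AssertError;
import std.conv : text, to;
import std.stdio : stderr;

bool autoThrow = true;  // hypothetical stand-in for the flag described above
size_t failures;        // hypothetical count of recorded failures

// `cond` is lazy, so the tested expression runs inside this function and
// an exception it throws can be recorded like any other failure.
void check(lazy bool cond, string what,
           string file = __FILE__, size_t line = __LINE__)
{
    bool ok;
    string detail = what;
    try
    {
        ok = cond;
    }
    catch (Exception e)
    {
        ok = false;
        detail = what ~ " (threw: " ~ e.msg ~ ")";
    }
    if (ok)
        return;
    if (autoThrow)
        throw new AssertError(detail, file, line);
    ++failures;
    stderr.writeln(file, "(", line, "): ", detail);
}

// Throws once at the end if any failure was recorded while autoThrow was off.
void flushChecks()
{
    if (failures == 0)
        return;
    immutable n = failures;
    failures = 0;
    throw new AssertError(text(n, " check(s) failed"));
}

unittest
{
    autoThrow = false;
    check(1 + 1 == 3, "arithmetic");           // recorded, unittest continues
    check(to!int("abc") == 0, "conversion");   // to!int throws; also recorded
    assert(failures == 2);

    bool thrown = false;
    try
    {
        flushChecks();
    }
    catch (AssertError)
    {
        thrown = true;
    }
    assert(thrown && failures == 0);
    autoThrow = true;
}
```

Only Exception is caught here; an Error raised by the tested code would still propagate, which seems the safer default.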
Mar 08 2011
prev sibling parent reply spir <denis.spir gmail.com> writes:
On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
I'm not sure I'd say the current std.path "basically works", but I get what you mean.
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it:

    http://d.puremagic.com/issues/show_bug.cgi?id=5547

I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
IIUC: The problem is that this feature belongs to the category of things that cannot be implemented by any D programmer, in D, as a library feature, even by an expert in the domain. It needs a representation of the unevaluated expression being asserted, which means compiler support, which in turn means hard low-level C/C++ and a great knowledge of the compiler architecture, especially the construction of the AST.

If there were a way to "quote" D expressions and get their representation at runtime, then we could do it ourselves (this would imply some performance penalty, but I consider it worth paying for the tremendous expressive power gained, and in fact totally negligible for an assert statement). Please tell me where I'm wrong.

With the same power, I would implement 'varWrite' at once:

    int x = 3;
    auto s = square(x);
    varWrite("value: 'x' --> square: 's'");
    // --> "value: 3 --> square: 9"

or maybe even:

    int x = 3;
    varWrite("value: 'x' --> square: 'x*x'");
    // --> "value: 3 --> square: 9"

Denis
-- 
_________________
vita es estrany
spir.wikidot.com
Mar 07 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-03-07 13:55, spir wrote:
 On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most
 urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
I'm not sure I'd say the current std.path "basically works", but I get what you mean.
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it:

    http://d.puremagic.com/issues/show_bug.cgi?id=5547

I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
IIUC: The problem is that this feature belongs to the category of things that cannot be implemented by any D programmer, in D, as a library feature, even by an expert in the domain. It needs a representation of the unevaluated expression being asserted, which means compiler support, which in turn means hard low-level C/C++ and a great knowledge of the compiler architecture, especially the construction of the AST.

If there were a way to "quote" D expressions and get their representation at runtime, then we could do it ourselves (this would imply some performance penalty, but I consider it worth paying for the tremendous expressive power gained, and in fact totally negligible for an assert statement). Please tell me where I'm wrong.

With the same power, I would implement 'varWrite' at once:

    int x = 3;
    auto s = square(x);
    varWrite("value: 'x' --> square: 's'");
    // --> "value: 3 --> square: 9"

or maybe even:

    int x = 3;
    varWrite("value: 'x' --> square: 'x*x'");
    // --> "value: 3 --> square: 9"

Denis
String mixins ? -- /Jacob Carlborg
Mar 07 2011
parent spir <denis.spir gmail.com> writes:
On 03/07/2011 02:36 PM, Jacob Carlborg wrote:
 On 2011-03-07 13:55, spir wrote:
 On 03/07/2011 07:20 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 21:57:30 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com> wrote in message
 news:mailman.2293.1299467610.4748.digitalmars-d puremagic.com...

 On Sunday 06 March 2011 18:08:49 Andrei Alexandrescu wrote:
 Yah, thing is people work on stuff they care about, not the most
 urgent
 stuff - surprise! :o) As such we don't have a ton of proposals for
 networking and xml, but we do have one (and I won't argue it's a bad
 one) for rehashing a module that basically worked.
I'm not sure I'd say the current std.path "basically works", but I get what you mean.
 I _was_ thinking of putting forward a new proposal which includes the
 unit testing functionality that assertPred had which won't end up in an
 improved
 assert,
Speaking of which: Now that assertPred has been rejected on the grounds of an improved assert that doesn't yet exist, what is the current status of the improved assert?
There's an enhancement request for it:

    http://d.puremagic.com/issues/show_bug.cgi?id=5547

I have no idea if any work is actually being done on it or not. It hasn't actually been assigned to anyone yet, for whatever that's worth. Honestly, it wouldn't surprise me if it doesn't happen for a while. I'm not sure that anyone who is capable of doing it is particularly motivated to do it (though I'm not sure that they're _not_ either). It was clear that a number of people wanted assert to be smarter rather than having assertPred, but it isn't clear that assert is going to be made smarter any time soon. I suspect that it will be a while before it's done. We'll have to wait and see though.
IIUC: The problem is that this feature belongs to the category of things that cannot be implemented by any D programmer, in D, as a library feature, even by an expert in the domain. It needs a representation of the unevaluated expression being asserted, which means compiler support, which in turn means hard low-level C/C++ and a great knowledge of the compiler architecture, especially the construction of the AST.

If there were a way to "quote" D expressions and get their representation at runtime, then we could do it ourselves (this would imply some performance penalty, but I consider it worth paying for the tremendous expressive power gained, and in fact totally negligible for an assert statement). Please tell me where I'm wrong.

With the same power, I would implement 'varWrite' at once:

    int x = 3;
    auto s = square(x);
    varWrite("value: 'x' --> square: 's'");
    // --> "value: 3 --> square: 9"

or maybe even:

    int x = 3;
    varWrite("value: 'x' --> square: 'x*x'");
    // --> "value: 3 --> square: 9"

Denis
String mixins ?
Works not, strings must be known at compile-time. And I don't want black magic. Denis -- _________________ vita es estrany spir.wikidot.com
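For reference, here is a minimal sketch of what the string-mixin route looks like when the expression text is available at compile time, which is exactly the restriction raised above. The helper name show is hypothetical; std.conv.text is the only library call used.

```d
import std.conv : text;

// Hypothetical helper: builds D source code for an expression that yields
// "<expr> = <value>". It runs at compile time, so `expr` must be a
// compile-time string, and it must not itself contain a double quote.
string show(string expr)
{
    return `text("` ~ expr ~ ` = ", (` ~ expr ~ `))`;
}

unittest
{
    int x = 3;
    // The mixin expands in the caller's scope, so it can see the local `x`.
    assert(mixin(show("x")) == "x = 3");
    assert(mixin(show("x*x")) == "x*x = 9");
}
```

The caller has to write mixin(...) at every use site, and a string built at runtime cannot be used this way, so this does not give the runtime "quoting" asked for.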
Mar 07 2011
prev sibling parent spir <denis.spir gmail.com> writes:
On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose
  something and stick with it. Regarding all the extension naming stuff, I
  suggest you go with the "suffix" nomenclature which is more general and
  applicable to all OSs.
I agree with Lars on this one. Everyone knows what an extension is. It's a universal concept even if it's not used as much on non-Windows OSes. There _are_ plenty of programs in *nix which use it internally (likely because it's a lot easier than dealing with mime type) even if they shouldn't.
eg: numerous compilers, programming editors,... ;-) Denis -- _________________ vita es estrany spir.wikidot.com
Mar 06 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 04:31:20 Lars T. Kyllingstad wrote:
 On Sat, 05 Mar 2011 16:32:55 +0000, Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;)

The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer.

  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt

(The same convention will be used for stripExtension, replaceExtension and defaultExtension.)
In summary, it seems currentDirSymbol, baseName, dirName and driveName are clear winners. Less clear, but still voted for by the majority, are extension and stripExtension. It is a tie between dirSep and dirSeparator.

Below are the votes I counted. And before you say "hey, I didn't know we could make suggestions of our own", or "why did that guy get several votes?", this was by no means a formal vote. It was just trying to get a feel for people's preferences. Before the module gets accepted into Phobos there will have to be a formal review process, so there is still a lot of opportunity to fight over naming. :)

  dirSep: 3 (Nick Sabalausky, spir, Jonathan M. Davis)
  dirSeparator: 3 (Bekenn, Jim, J Chapman)

  currDirSym: 1 (Jonathan M. Davis)
  currDirSymbol: 2 (Nick Sabalausky, Jonathan M. Davis)
  path.current: 1 (Andrej Mitrovic)
  currentDirSymbol: 4 (Bekenn, Jim, J Chapman, spir)

  baseName: 6 (Nick Sabalausky, Bekenn, Jim, J Chapman, spir, Jonathan M. Davis)
  baseFileName: 1 (Nick Sabalausky)
  fileName: 1 (spir)
  basename: 1 (Andrei Alexandrescu)

  dirName: 6 (Nick Sabalausky, Bekenn, Jim, spir, Jonathan M. Davis, David Nadlinger)
  directory: 1 (Nick Sabalausky)
  getDirName: 2 (J Chapman, spir)
  dirname: 1 (Andrei Alexandrescu)

  driveName: 4 (Nick Sabalausky, Bekenn, Jim, spir)
  drive: 2 (Nick Sabalausky, Jonathan M. Davis)
  getDriveName: 2 (J Chapman, spir)
  driveLetter: 1 (Jonathan M. Davis)

  ext: 1 (Nick Sabalausky)
  extension: 2 (Bekenn, Jim)
  getExtension: 1 (J Chapman)

  stripExt: 2 (Nick Sabalausky, Jonathan M. Davis)
  stripExtension: 3 (Bekenn, Jim, J Chapman)
This is a very small sampling of even the folks here on the newsgroup, let alone the D community at large, so I don't think that you can really base all _that_ much off of the votes. Rather, I think that you should pretty much do what Andrei said and pick what you think is best, but now you have some opinions and arguments from other people that you can take into consideration when naming the functions. As Andrei said, you're never going to get everyone to agree anyway.

I think that the general guideline here should be that the names be descriptive but as short as they can reasonably be while still being appropriately descriptive. Names which are not descriptive enough are likely to be unclear, but names that are very descriptive but also very long are likely to get very annoying - especially if you have to use them often and/or have to deal with a character limit per line.

So, take what has been said into consideration and adjust the names as you think is appropriate. I'm sure that they'll get debated further when you actually put it up for a full review. But naming is arguably _the_ classic bike shedding issue. It matters, but not in proportion to the amount of discussion and arguing that it gets, and you'll _never_ get everyone to agree over it.

On a side note, any functions that have changed behavior should probably have names which are different from what's currently in std.path. So, for instance, if your basename function has different behavior from the current std.path's basename, you should probably give it a different name (in this case, the obvious solution is baseName - it actually follows Phobos' naming conventions and was the pretty clear favorite in this discussion). Otherwise, you're going to break code when your code gets merged into Phobos. If the behavioral change is small, then perhaps a new name is not necessary, but I know that Walter is _very_ much against breaking code with changes to Phobos, and silently changing behavior on someone is one of the worst ways to do that. Fortunately, I believe that pretty much all of your functions have new names, but that _is_ something to consider when naming stuff.

- Jonathan M Davis
Mar 06 2011
prev sibling next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
dirSep
curDirSymbol
baseName
directory
drive
ext
stripExt

I would actually prefer getDir, getDrive and getExt if there was a  
corresponding getName (instead of baseName).

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Mar 07 2011
prev sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Sat, 05 Mar 2011 16:32:55 -0000, Lars T. Kyllingstad  
<public kyllingen.nospamnet> wrote:
 The only disagreement seems to be about the naming, so let's have a round
 of voting.  Here are a few alternatives for each function.  Please say
 which ones you prefer.

  * dirSeparator, dirSep, sep
  * currentDirSymbol, currentDirSym, curDirSymbol
  * basename, baseName, filename, fileName
  * dirname, dirName, directory, getDir, getDirName
  * drivename, driveName, drive, getDrive, getDriveName
  * extension, ext, getExt, getExtension
  * stripExtension, stripExt
Is it just me that feels dirName and getDirName are ambiguous? i.e. in the path:

    c:\temp\folder\name\file.ext

There are 3 directories:
 - Their "names" are 'temp', 'folder' and 'name'
 - Their "paths" are c:\temp, c:\temp\folder and c:\temp\folder\name

It's the reason I think baseName is clearer than fileName: with fileName you're not sure whether it means the complete/full filename including directories or just the filename itself, with or without extension. baseName (perhaps once you're used to the idea of it) implies the shorter form.

In fact, why not call baseName on directories too, to remove the leading path components? e.g.

    getDir("c:\temp\folder\name\file.ext")           -> "c:\temp\folder\name"
    baseName(getDir("c:\temp\folder\name\file.ext")) -> "name"

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
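The composition suggested above can be tried with the baseName and dirName functions found in the std.path that ships with current Phobos, which may not match the proposal exactly as it stood during this discussion. The sketch assumes a POSIX-style path with forward slashes, to keep it independent of drive-letter handling.

```d
import std.path : baseName, dirName;

unittest
{
    enum path = "/temp/folder/name/file.ext";

    // Last component of the path, extension included.
    assert(baseName(path) == "file.ext");

    // Everything leading up to the last component.
    assert(dirName(path) == "/temp/folder/name");

    // Composing the two yields the "name" of the containing directory.
    assert(baseName(dirName(path)) == "name");
}
```

Read this way, dirName returns what the post calls the directory's "path", and baseName applied to that result returns its "name".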
Mar 07 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 05 March 2011 17:43:50 Andrej Mitrovic wrote:
 I dunno, maybe I'd prefer an enum.
 
 enum path : string { current = ".", up = ".." };
 
 main() { string newPath = join("C:", "Windows", "Subdir", path.up,
 path.up, "Program Files");
 newPath == r"C:\Windows\Subdir\..\..\Program Files";
 
 This is just nitpicking however. And 'current' is only used on Linux afaik?
 :)
I have no idea what's used on Windows. I rarely use it these days. - Jonathan M Davis
Mar 05 2011
parent Adam Ruppe <destructionator gmail.com> writes:
current == "." on Windows too.
Mar 05 2011
prev sibling next sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
Looks good overall. I have a few comments and nitpicks though:

   basename("dir/subdir/")             -->  "subdir"
   directory("dir/subdir/")      -->  "dir"
Is this what everybody expects? I'm not sure, but another possibility would be to treat these as if "dir/subdir/." is passed. What is the result of directory("/") or directory("d:/")?
   extension("file")               -->  ""
   extension("file.ext")           -->  "ext"
What about "file."? I tried it on NTFS, but a trailing '.' always seems to be cut off. Is it possible to create such a file on unix systems? If yes, you won't be able to recreate it from the result of basename() and extension().

What about network shares like "\\server\share\dir\file"? Maybe it should also be shown in the examples? Does the "\\server" part need special consideration?

Rainer

Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I 
 started working on a rewrite of std.path a long time ago, but I got 
 sidetracked by other things.  The recent discussion got me working on it 
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of 
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are 
 toAbsolute() and toCanonical, because they rely on std.file.getcwd() 
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current 
 std.path.  See the other thread for some examples, or take a look at the 
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
 
 -Lars
Mar 06 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 00:37:15 Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"
 
 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?
How about

    baseName("dir/subdir/")  -->  "subdir/"
    dirName("dir/subdir/")   -->  "dir"

There _are_ programs (such as rsync) which care about whether a / is included at the end of the path. Doing that should also deal with the "/" and "d:/" issue. So, I can see why Lars would have made the base name of "dir/subdir" be "subdir" instead of "subdir/" (I don't know whether that's the current behavior or not, so he may just have copied it from what's currently there), but it seems to me that it will be more consistent to treat "subdir/" as the base name of "dir/subdir". Unfortunately, sometimes there _is_ a difference between "subdir" and "subdir/".
  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"
 
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
*nix doesn't really do anything special with any file names. The closest is files which start with "." - most programs consider those to be hidden and don't show them. There's definitely no problem with using "file." as a file name. This is probably a good argument for putting the "." back in the extension like it was before.
 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?
Probably, unfortunately. \\ is kind of like a drive letter, so it really should be special cased, I think. - Jonathan M Davis
Mar 06 2011
prev sibling next sibling parent reply "Jérôme M. Berger" <jeberger free.fr> writes:
Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
 
   basename("dir/subdir/")             -->  "subdir"
   directory("dir/subdir/")      -->  "dir"
 
I would say:

    basename ("dir/subdir/") -> "" (or ".")
    dirname  ("dir/subdir/") -> "dir/subdir"
    basename ("dir/subdir")  -> "subdir"
    dirname  ("dir/subdir")  -> "dir"

Same as Python does.
 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?
 
   extension("file")               -->  ""
   extension("file.ext")           -->  "ext"
 
    extension ("file")     -> ""
    extension ("file.ext") -> ".ext"
    extension ("file.")    -> "."

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
 
Jerome
-- 
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Mar 06 2011
parent spir <denis.spir gmail.com> writes:
On 03/06/2011 12:50 PM, "Jérôme M. Berger" wrote:
 Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:

    basename("dir/subdir/")             -->   "subdir"
    directory("dir/subdir/")      -->   "dir"
I would say:

basename ("dir/subdir/") -> "" (or ".")
dirname ("dir/subdir/")  -> "dir/subdir"
basename ("dir/subdir")  -> "subdir"
dirname ("dir/subdir")   -> "dir"

Same as Python does.
 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?

    extension("file")               -->   ""
    extension("file.ext")           -->   "ext"
extension ("file")     -> ""
extension ("file.ext") -> ".ext"
extension ("file.")    -> "."
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
This solves the issue of recomposing a file path/name from its parts. But it's not what people mean, expect, and need with the notion of extension. We would have to remember this (weird) behaviour of the extension() function, and systematically strip off the leading '.'. Then, we get caught when the result is ""! Thus, we must add a check:

    auto extension = path.extension(foo);
    if (extension.length > 0 && extension[0] == '.')
        extension = extension[1..$];

Very nice...

Denis
--
_________________
vita es estrany
spir.wikidot.com
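The same pattern can be sketched in Python, where the standard splitext() also keeps the dot in the extension. The helper name extension_without_dot is hypothetical and only serves the illustration.

```python
import posixpath

def extension_without_dot(name):
    """Return the extension of `name` without its leading '.' (hypothetical helper)."""
    ext = posixpath.splitext(name)[1]  # ".ext", "." or ""
    # startswith() is safe on the empty string, so no separate length check is needed.
    return ext[1:] if ext.startswith(".") else ext

print(extension_without_dot("file.ext"))
print(repr(extension_without_dot("file")))
print(repr(extension_without_dot("file.")))
```

Once the dot is stripped, "file" and "file." give the same result, which is the round-trip ambiguity raised earlier in the thread.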
Mar 06 2011
prev sibling next sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:
 
  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"
 
 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.
I don't know about everybody, but it is what *NIX users expect, at least. I have written those functions so they adhere to the POSIX requirements for the 'basename' and 'dirname' commands.
 What is the
 result of directory("/") or directory("d:/")?
"/" and "d:/", respectively. The first is what 'dirname' prints, and the second is the natural extension to Windows paths. (I believe I have covered most corner cases in the unittests. I think it would just be confusing to add all of them to the documentation.)
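As a rough sketch of the POSIX-style behaviour described above (not the actual ltk.path code): trailing slashes are ignored, then the last element is split off. The function name split_last_element is hypothetical, only '/' is treated as a separator, and Windows drive letters such as "d:/" are deliberately left out. Returning "." as the directory of a bare name follows the POSIX dirname command and is an assumption as far as the proposed module is concerned.

```python
def split_last_element(path):
    """Return (dir_name, base_name) following the POSIX dirname/basename idea."""
    stripped = path.rstrip("/")
    if not stripped:
        # Empty input, or a path made only of slashes (the root directory).
        return ("/", "/") if path else (".", "")
    head, sep, tail = stripped.rpartition("/")
    if not sep:
        # No separator at all: the element lives in the current directory.
        return (".", tail)
    # Drop separators left at the end of the directory part ("a//b" -> "a").
    head = head.rstrip("/")
    return (head or "/", tail)

for p in ("dir/subdir/", "dir/subdir", "/", "/usr", "file"):
    print(repr(p), "->", split_last_element(p))
```

With this rule "dir/subdir/" and "dir/subdir" give the same answer.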
  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"
 
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?
Hmm.. that's another good point. I haven't even thought of those, but they should probably be covered as well. I'll look into it.

-Lars
Mar 06 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 03:56:53 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"
 
 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.
I don't know about everybody, but it is what *NIX users expect, at least. I have written those functions so they adhere to the POSIX requirements for the 'basename' and 'dirname' commands.
If there's a standard way to deal with that, then that's probably best.
 What is the
 result of directory("/") or directory("d:/")?
"/" and "d:/", respectively. The first is what 'dirname' prints, and the second is the natural extension to Windows paths. (I believe I have covered most corner cases in the unittests. I think it would just be confusing to add all of them to the documentation.)
  >   extension("file")               -->  ""
  >   extension("file.ext")           -->  "ext"
 
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
I kind of like how your extension doesn't include the "." in it, since you'd often want to remove it anyway, but given this particular ambiguity, I think that it's probably better to go with the old way of including the "." in the extension.

- Jonathan M Davis
Mar 06 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:ikvsq5$1qr9$2 digitalmars.com...
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.
I don't know about everybody, but it is what *NIX users expect, at least. I have written those functions so they adhere to the POSIX requirements for the 'basename' and 'dirname' commands.
I initially felt somewhat uncomfortable with the idea of that behavior, but then I realized two things: 1. You don't have to constantly worry about "trailing slash" vs "no trailing slash" and remember the different semantics. (The "trailing slash" vs "no trailing slash" matter can be a real pain.) 2. It'll always treat a path to a directory the same way as a path to a file. (Consistency is nice. Especially since you don't always know if something is intended to be a file or directory.)
Mar 06 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 03/06/2011 12:56 PM, Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

   >    basename("dir/subdir/")             -->   "subdir"
   >    directory("dir/subdir/")      -->   "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.
I don't know about everybody, but it is what *NIX users expect, at least. I have written those functions so they adhere to the POSIX requirements for the 'basename' and 'dirname' commands.
 What is the
 result of directory("/") or directory("d:/")?
"/" and "d:/", respectively. The first is what 'dirname' prints, and the second is the natural extension to Windows paths. (I believe I have covered most corner cases in the unittests. I think it would just be confusing to add all of them to the documentation.)
   >    extension("file")               -->   ""
   >    extension("file.ext")           -->   "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?
Hmm.. that's another good point. I haven't even thought of those, but they should probably be covered as well. I'll look into it.
What about extending the notion of 'device' (see other post) to cover 'http://' and "ftp://"? Would it be complicated?

Denis
--
_________________
vita es estrany
spir.wikidot.com
Mar 06 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:

 On 03/06/2011 12:56 PM, Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 Looks good overall. I have a few comments and nitpicks though:

   >    basename("dir/subdir/")             -->   "subdir"
   >    directory("dir/subdir/")      -->   "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed.
I don't know about everybody, but it is what *NIX users expect, at least. I have written those functions so they adhere to the POSIX requirements for the 'basename' and 'dirname' commands.
 What is the
 result of directory("/") or directory("d:/")?
"/" and "d:/", respectively. The first is what 'dirname' prints, and the second is the natural extension to Windows paths. (I believe I have covered most corner cases in the unittests. I think it would just be confusing to add all of them to the documentation.)
   >    extension("file")               -->   ""
   >    extension("file.ext")           -->   "ext"

 What about "file."? I tried it on NTFS, but trailing '.' seems to
 always be cut off. Is it possible to create such a file on unix
 systems? If yes, you won't be able to recreate it from the result of
 basename() and extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
 What about network shares like "\\server\share\dir\file"? Maybe it
 should also be shown in the examples? Does the "\\server" part need
 special consideration?
Hmm.. that's another good point. I haven't even thought of those, but they should probably be covered as well. I'll look into it.
What about extending the notion of 'device' (see other post) to cover 'http://' and "ftp://"? Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.

-Lars
Mar 06 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's. That said, I'm not sure this would be necessary for round 1 of the new std.path. Could just be added later.
Mar 06 2011
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 13:49:59 Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's. That said, I'm not sure this would be necessary for round 1 of the new std.path. Could just be added later.
We do have std.uri, though it's pretty bare-bones at the moment.

- Jonathan M Davis
Mar 06 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 03/06/2011 10:49 PM, Nick Sabalausky wrote:
 "Lars T. Kyllingstad"<public kyllingen.NOSPAMnet>  wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's. That said, I'm not sure this would be necessary for round 1 of the new std.path. Could just be added later.
Right, but if there is a reasonable probability of such an extension, then we must think about it, so to say, "at design time". Otherwise, various common issues will raise barriers to extension (existing codebase, detail conflicts, refactoring requirements... naming! ;-) (*) Then, once such work is well under way, implementation is possibly no longer such a big deal. Or, conversely, we may feel the need for prototyping and trials to construct and/or validate a big-picture design. Etc...

To sum up: since there is no emergency (--> Andrei's last post), we have a very good base thanks to Lars's well-thought-out work, and there are already a number of people involved in the discussion -- why not?

Denis

(*) drive name --> ?
--
_________________
vita es estrany
spir.wikidot.com
Mar 06 2011
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's.
I am now certain that std.path should not give URIs any kind of special treatment, for the simple reason that most URIs are also valid paths on POSIX. Specifically, file and directory names may contain the ':' character, and multiple consecutive slashes are treated as a single slash. In other words, you can do this:

    mkdir http:
    mkdir http://www.digitalmars.com
    cd http://www.digitalmars.com

That means std.path should treat "http:" as just another path component, and it should treat "//" on equal footing with "/". This is how it's done now, and it is how it should be.

-Lars
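To make the point concrete, here is a small sketch of what "just another path component" means, using Python's posixpath purely as an illustration of POSIX-style handling (it is not the std.path implementation):

```python
import posixpath

uri = "http://www.digitalmars.com"

# Split on '/', dropping the empty component produced by the doubled slash.
components = [part for part in uri.split("/") if part]
print(components)  # ['http:', 'www.digitalmars.com']

# normpath() collapses the repeated separator, as a POSIX file system would.
normalized = posixpath.normpath(uri)
print(normalized)  # http:/www.digitalmars.com
```

Read as a path, the string is simply a directory named "http:" containing an entry named "www.digitalmars.com".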
Mar 07 2011
next sibling parent Jim <bitcirkel yahoo.com> writes:
Lars T. Kyllingstad Wrote:

 On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:
 
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's.
I am now certain that std.path should not give URIs any kind of special treatment, for the simple reason that most URIs are also valid paths on POSIX. Specifically, file and directory names may contain the ':' character, and multiple consecutive slashes are treated as a single slash. In other words, you can do this:

    mkdir http:
    mkdir http://www.digitalmars.com
    cd http://www.digitalmars.com

That means std.path should treat "http:" as just another path component, and it should treat "//" on equal footing with "/". This is how it's done now, and it is how it should be.

-Lars
Not quite sure it would be that easy. http://en.wikipedia.org/wiki/URI_scheme
Mar 07 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il28cm$2phc$1 digitalmars.com...
 On Sun, 06 Mar 2011 16:49:59 -0500, Nick Sabalausky wrote:

 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:il09fp$2h5d$1 digitalmars.com...
 On Sun, 06 Mar 2011 15:54:19 +0100, spir wrote:
 What about extending the notion of 'device' (see other post) to cover
 'http://' and "ftp://"?
 Would it be complicated?
I don't think std.path should handle general URIs. It should only have to deal with the kind of paths you can pass to the functions in std.file and std.stdio.
If std.path doesn't handle uri's, then we'd need a whole other set of functions for dealing with uris. And at least a few of the functions would overlap. And then people who want to be able to handle both files and uris will want functions that will seamlessly handle either. So I think it really would be best to just bite the bullet and have std.path handle uri's.
I am now certain that std.path should not give URIs any kind of special treatment, for the simple reason that most URIs are also valid paths on POSIX. Specifically, file and directory names may contain the ':' character, and multiple consecutive slashes are treated as a single slash. In other words, you can do this:

    mkdir http:
    mkdir http://www.digitalmars.com
    cd http://www.digitalmars.com

That means std.path should treat "http:" as just another path component, and it should treat "//" on equal footing with "/". This is how it's done now, and it is how it should be.
I really wish that wasn't such a good argument. I'm now convinced too, albeit reluctantly. Like anyone else, I certainly believe that MS has made a number of bad calls about certain things. But this is one case where I actually wish unix worked the windows way: If unix weren't so permissive about filename chars, then we wouldn't have such ambiguities. Oh well. At least URIs have the file:/// protocol, so at least you can treat local and remote the same if you assume everything to be interpreted as a URI. I just wish it were possible to actually *detect* URI vs filepath outside of windows.
Mar 07 2011
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:
 
 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
Maybe special casing similar to the "hidden" files starting with '.':

basename("file.")  --> "file."
extension("file.") --> ""
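A minimal sketch of this special-casing rule, assuming the extension is returned without its dot as in the proposal under discussion. The names split_extension and join_extension are hypothetical, and the functions operate on a single file name rather than a full path. A dot counts as a separator only when there is text on both sides of it, which keeps splitting reversible.

```python
def split_extension(name):
    """Split a file name into (stem, extension-without-dot)."""
    dot = name.rfind(".")
    # No dot, a leading dot (".exe") or a trailing dot ("file."):
    # the dot is part of the name, not an extension separator.
    if dot <= 0 or dot == len(name) - 1:
        return (name, "")
    return (name[:dot], name[dot + 1:])

def join_extension(stem, ext):
    """Inverse of split_extension()."""
    return stem + "." + ext if ext else stem

for n in ("file", "file.ext", "file.", ".exe", "a.b.c"):
    stem, ext = split_extension(n)
    print(repr(n), "->", (stem, ext), "round-trips:", join_extension(stem, ext) == n)
```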
Mar 06 2011
parent spir <denis.spir gmail.com> writes:
On 03/06/2011 04:41 PM, Rainer Schuetze wrote:
 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 09:37:15 +0100, Rainer Schuetze wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always
 be cut off. Is it possible to create such a file on unix systems? If
 yes, you won't be able to recreate it from the result of basename() and
 extension().
Good point. I don't know if there is any kind of precedent here. What do others think?
Maybe special casing similar to the "hidden" files starting with '.':

basename("file.")  --> "file."
extension("file.") --> ""
I agree, and this is probably the correct solution: if there is nothing after the dot, then it's not an extension separator, thus it's part of the baseName (just like if there is nothing before the dot).

Denis
--
_________________
vita es estrany
spir.wikidot.com
Mar 06 2011
prev sibling next sibling parent reply spir <denis.spir gmail.com> writes:
On 03/06/2011 09:37 AM, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
I think all your questions are sensible, Rainer.
    basename("dir/subdir/")             -->  "subdir"
    directory("dir/subdir/")      -->  "dir"
Is this what everybody expects? I'm not sure, but another possibility would be to treat these as if "dir/subdir/." is passed. What is the result of directory("/") or directory("d:/")?
Depends. We must make clear whether such funcs work:
1. indifferently for file and dir names, in which case we get the above results,
2. differently for file & dir names, in which case we would have "dir/subdir/" as result of both operations above,
3. only for file names, in which case we throw an error when these functions are called on dir names.

I find both solutions 1. and 2. conceptually problematic; the second one only a bit less. Maybe the only sensible choice is 3.?
    extension("file")               -->  ""
    extension("file.ext")           -->  "ext"
What about "file."? I tried it on NTFS, but trailing '.' seems to always be cut off. Is it possible to create such a file on unix systems? If yes, you won't be able to recreate it from the result of basename() and extension().
This is /really/ problematic, indeed! The splitting operation *must* be reversible in all cases. In other words, recomposing a file name/path must be the exact inverse of splitting it.
 What about network shares like "\\server\share\dir\file"? Maybe it should also
 be shown in the examples? Does the "\\server" part need special consideration?
I think there should be a special case similar to windows drive names. Maybe, instead of a notion of drive, have a notion of 'device', which could then cover network connexion. Then, a full file path/name would be composed of:

    deviceName | dirName || baseName | extension

One issue is defining the appropriate 'joint'/sep between deviceName & dirName. (See split <--> recomposition above.) What do you think?

Denis
--
_________________
vita es estrany
spir.wikidot.com
Mar 06 2011
parent reply "Jérôme M. Berger" <jeberger free.fr> writes:
spir wrote:
 On 03/06/2011 09:37 AM, Rainer Schuetze wrote:
 Looks good overall. I have a few comments and nitpicks though:
I think all your questions are sensible, Rainer.
    basename("dir/subdir/")             -->  "subdir"
    directory("dir/subdir/")      -->  "dir"
Is this what everybody expects? I'm not sure, but another possibility would be to treat these as if "dir/subdir/." is passed. What is the result of directory("/") or directory("d:/")?
 Depends. We must make clear whether such funcs work:
 1. indifferently for file and dir names, in which case we get the above
 results,
 2. differently for file & dir names, in which case we would have
 "dir/subdir/" as result of both operations above,
 3. only for file names, in which case we throw an error when these
 functions are called on dir names.

 I find both solutions 1. and 2. conceptually problematic; the second one
 only a bit less. Maybe the only sensible choice is 3.?

This does not make sense because there is no way to tell whether "foo/bar" is intended as a file name or a dir name. IMO the only sensible thing to do is to split on the last path separator: everything to the right is the base name (or everything if there is no separator) and everything to the left is the dir name. This has two very important advantages:

- It is a simple rule, so is easy to remember;
- It does not need the path to exist and it does not need to know whether the path is intended as a file or dir.

Jerome
--
mailto:jeberger free.fr
http://jeberger.free.fr
Jabber: jeberger jabber.fr
Mar 06 2011
parent "Nick Sabalausky" <a a.a> writes:
"Jérôme M. Berger" <jeberger free.fr> wrote in message 
news:il0f04$2ts8$1 digitalmars.com...
 This does not make sense because there is no way to tell whether
"foo/bar" is intended as a file name or a dir name. IMO the only
sensible thing to do is to split on the last path separator:
everything to the right is the base name (or everything if there is
no separator) and everything to the left is the dir name. This has
the two very important advantages:

- It is a simple rule, so is easy to remember;
But it doesn't have simple consequences. If I'm trying to refer to a particular directory there's a good chance it could be either "/foo/bar" or "/foo/bar/" (and the latter is *not* typically thought of as a shorthand for "/foo/bar/."). Those are conceptually the *exact same thing*, but with the "last slash" rule you suggest, they have wildly different effects when passed to certain std.path functions. Most notably, if it's a path with a trailing slash, then dirName **no longer returns the directory that *contains* the element specified**. It just returns the element itself *instead* of its containing directory.

So, since certain functions would have notably different effects with and without a trailing slash, and the trailing slash may or may not have been given (since the two styles are typically thought of as interchangeable), every time you call a std.path function the "last slash" rule would force you to go through these steps:

1. Remember if the function you're using is one that's affected.
2. If so, decide which semantics you want.
3. Detect if the "trailing-slashness" of your string matches the semantics you want. Which may, in fact, be impossible: If the semantics you desire dictate a trailing slash on directories, and your string lacks a trailing slash then the *only* way to proceed correctly is to know whether it's intended to be a file or a directory, and you don't always know.
4. Coerce your string to match the desired semantics, if possible.
5. Finally call the damned function.
- It does not need the path to exists and it does not need to know
whether the path is intended as a file or dir.
As I described above, it will sometimes need to know.

Alternatively, the current behavior of Lars's proposed std.path is, to the human mind, an equally simple rule and therefore equally simple to remember: The last element is the baseName and all the elements before it are the dirName. In contrast to the "last slash" rule, this "last element" rule behaves exactly the same regardless of whether a trailing slash was appended or omitted and *actually* never needs to know if the path is intended as a file or dir. So the five steps above get condensed down to one:

1. Just call the damned function.

I'll admit, the "last element" behavior of Lars's proposed std.path did raise a small red flag to me at first. But the more I think about it, the more I think it's the best way to go.
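The difference between the two rules can be sketched side by side. The function names last_slash and last_element are hypothetical labels for the two rules in this discussion; posixpath.split() happens to implement the "last slash" rule. The sketch ignores the root directory "/" and Windows paths.

```python
import posixpath

def last_slash(path):
    """'Last slash' rule: split at the final separator."""
    return posixpath.split(path)

def last_element(path):
    """'Last element' rule: ignore trailing separators, then split."""
    return posixpath.split(path.rstrip("/"))

for p in ("/foo/bar", "/foo/bar/"):
    print(repr(p), "last slash:", last_slash(p), "last element:", last_element(p))
```

Only the "last element" rule gives the same (dirName, baseName) pair for both spellings of the directory.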
Mar 06 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sun, 06 Mar 2011 10:37:15 +0200, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always  
 be cut off.
It's possible to create files and directories with one trailing dot on Windows/NTFS. FAR Manager allows doing this, for example. I'm not sure if the implementation does anything special to achieve this, but it's not impossible. (Ditto with leading and trailing spaces.)

By the way, not sure if it's been mentioned in this discussion but: ".exe" is an executable file with no name. It's perfectly valid.

--
Best regards,
 Vladimir                            mailto:vladimir thecybershadow.net
Mar 06 2011
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/6/11, Vladimir Panteleev <vladimir thecybershadow.net> wrote:
 ".exe" is an executable file with no name. It's perfectly valid.
Although for some reason Explorer never lets you do that. Well, I have a hotkey for creating filenames so I just let autohotkey create the file such as ".file". Doing it in explorer via rename gets: "You must type a file name".
Mar 06 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Vladimir Panteleev" <vladimir thecybershadow.net> wrote in message 
news:op.vrxw6dmltuzx1w cybershadow.mshome.net...
 On Sun, 06 Mar 2011 10:37:15 +0200, Rainer Schuetze <r.sagitario gmx.de> 
 wrote:

 What about "file."? I tried it on NTFS, but trailing '.' seems to always 
 be cut off.
It's possible to create files and directories with one trailing dot on Windows/NTFS. FAR Manager allows doing this, for example. I'm not sure if the implementation does anything special to achieve this, but it's not impossible. (Ditto with leading and trailing spaces.) By the way, not sure if it's been mentioned in this discussion but: ".exe" is an executable file with no name. It's perfectly valid.
It ain't valid when optlink creates it ;)
Mar 06 2011
prev sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>  
wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility  
 would be to treat these as if "dir/subdir/." is passed. What is the  
 result of directory("/") or directory("d:/")?
?? I would expect:

directory("dir/subdir/") --> "dir/subdir"

as subdir _is_ a dir, not a file, as shown by the trailing slash. If it was:

directory("dir/subdir") --> "dir"

as subdir is perhaps not a directory, as there is no trailing slash.

I realise this means the trailing slash becomes important, but it kinda is important as it does tell us when something is definitely a directory.

Alternately, we could ignore the distinction between file and directory - as we're essentially just parsing strings here - and have two functions:

lastComponent("dir/subdir/") -> "subdir"
lastComponent("dir/subdir")  -> "subdir"

allButLastComponent("dir/subdir/") -> "dir/"
allButLastComponent("dir/subdir")  -> "dir/"

--
Using Opera's revolutionary email client: http://www.opera.com/mail/
Mar 07 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Mon, 07 Mar 2011 10:25:21 +0000, Regan Heath wrote:

 On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>
 wrote:
 
 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?
?? I would expect:

directory("dir/subdir/") --> "dir/subdir"

as subdir _is_ a dir, not a file, as shown by the trailing slash. If it was:

directory("dir/subdir") --> "dir"

as subdir is perhaps not a directory, as there is no trailing slash.

I realise this means the trailing slash becomes important, but it kinda is important as it does tell us when something is definitely a directory.
I don't think it does, or rather, I don't think there is such a thing as "definitely a directory". What about a symlink to a directory, for instance? On one hand, it *is* a file that contains a reference to a directory, and on the other, in most respects it *acts like* a directory. You can even argue that a "file" is simply the term used for a node in the filesystem tree, and that "directory" is a special kind of file that contains a list of other files. This terminology is pretty standard in *NIX land, at least. (Just google "everything is a file".)
 Alternately, we could ignore the distinction between file and directory
 - as we're essentially just parsing strings here - and have two
 functions:
 
 lastComponent("dir/subdir/")  -> "subdir"
 lastComponent("dir/subdir")   -> "subdir"
 
 allButLastComponent("dir/subdir/") -> "dir/"
 allButLastComponent("dir/subdir")  -> "dir/"
That's how it's done now, and how I think it should be. The two paths "dir/subdir" and "dir/subdir/" both refer to the same object in the file system, namely "subdir". baseName gives you the name of the object referred to by a path, while dirName gives you the directory containing said object. Whether that object is a file or a directory is irrelevant. (And if you need to know what it is, there is always std.file.isDir and isFile.) -Lars
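For anyone who wants to check this behaviour themselves, here is a small sketch. It assumes the baseName and dirName functions from the std.path that ships with current Phobos, not the ltk.path code linked above, and the helper name splitLast is made up purely for illustration.

```d
import std.path : baseName, dirName;
import std.typecons : Tuple;

/// Hypothetical helper (not part of std.path or ltk.path): returns the
/// containing directory and the name of the object a path refers to.
Tuple!(string, "dir", string, "name") splitLast(string path)
{
    return typeof(return)(dirName(path), baseName(path));
}

unittest
{
    // With or without the trailing slash, the path names the same object.
    assert(splitLast("dir/subdir/").name == "subdir");
    assert(splitLast("dir/subdir").name == "subdir");

    // ...and the containing directory is the same in both cases.
    assert(splitLast("dir/subdir/").dir == "dir");
    assert(splitLast("dir/subdir").dir == "dir");

    // The root directory is its own parent.
    assert(splitLast("/").dir == "/");
}
```

Whether the object is actually a file or a directory still has to be asked of the filesystem (std.file.isDir / isFile); nothing here touches the disk.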
Mar 07 2011
next sibling parent spir <denis.spir gmail.com> writes:
On 03/07/2011 01:08 PM, Lars T. Kyllingstad wrote:
 Alternately, we could ignore the distinction between file and directory
  - as we're essentially just parsing strings here - and have two
  functions:

  lastComponent("dir/subdir/")  ->  "subdir"
  lastComponent("dir/subdir")   ->  "subdir"

  allButLastComponent("dir/subdir/") ->  "dir/"
  allButLastComponent("dir/subdir")  ->  "dir/"
That's how it's done now, and how I think it should be. The two paths "dir/subdir" and "dir/subdir/" both refer to the same object in the file system, namely "subdir". baseName gives you the name of the object referred to by a path, while dirName gives you the directory containing said object. Whether that object is a file or a directory is irrelevant. (And if you need to know what it is, there is always std.file.isDir and isFile.)
After some more thought, I think you are right on this point. Precisely because of possible trailing '/'. If OSes were clearer and more consistent, then we could and certainly should make a useful semantic distinction. Denis -- _________________ vita es estrany spir.wikidot.com
Mar 07 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message 
news:il2hsp$89d$2 digitalmars.com...
 On Mon, 07 Mar 2011 10:25:21 +0000, Regan Heath wrote:

 On Sun, 06 Mar 2011 08:37:15 -0000, Rainer Schuetze <r.sagitario gmx.de>
 wrote:

 Looks good overall. I have a few comments and nitpicks though:

  >   basename("dir/subdir/")             -->  "subdir"
  >   directory("dir/subdir/")      -->  "dir"

 Is this what everybody expects? I'm not sure, but another possibility
 would be to treat these as if "dir/subdir/." is passed. What is the
 result of directory("/") or directory("d:/")?
?? I would expect: directory("dir/subdir/") --> "dir/subdir" as subdir _is_ a dir, not a file, as shown by the trailing slash. If it was: directory("dir/subdir") --> "dir" as subdir is perhaps not a directory, as there is no trailing slash. I realise this means the trailing slash becomes important, but it kinda is important as it does tell us when something is definitely a directory.
I don't think it does, or rather, I don't think there is such a thing as "definitely a directory". What about a symlink to a directory, for instance? On one hand, it *is* a file that contains a reference to a directory, and on the other, in most respects it *acts like* a directory. You can even argue that a "file" is simply the term used for a node in the filesystem tree, and that "directory" is a special kind of file that contains a list of other files. This terminology is pretty standard in *NIX land, at least. (Just google "everything is a file".)
That's true on windows too: "Note that a directory is simply a file with a special attribute designating it as a directory..." http://msdn.microsoft.com/en-us/library/aa365247%28v=VS.85%29.aspx#file_and_directory_names
Mar 07 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
I hate to be nitpicky, but I notice that you're the only author listed for this module. The current std.path has several authors - none of which are you. So, unless you rewrote all of the code from scratch (which you may have done), you really should put the other names on it too (though if you rewrote it thoroughly enough, they may have very little left in it that they did; unfortunately, without knowing who wrote what, you need to put all of their names on it if any of the original code is there). - Jonathan M Davis
Mar 06 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 06 Mar 2011 01:21:56 -0800, Jonathan M Davis wrote:

 On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
 
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
I hate to be nitpicky, but I notice that you're the only author listed for this module. The current std.path has several authors - none of which are you. So, unless you rewrote all of the code from scratch (which you may have done), you really should put the other names on it too (though if you rewrote it thoroughly enough, they may have very little left in it that they did; unfortunately, without knowing who wrote what, you need to put all of their names on it if any of the original code is there).
Everything you see in that module is completely rewritten from scratch. I started out by trying to make changes to the original std.path, but quickly found that I had to change so much it was better to start with a clean slate. As long as the module is a part of my own library, and doesn't contain anyone else's code, I'll only put my name on it. When it gets included in Phobos, and I add the remaining functions (fcmp, fnmatch, fncharmatch and expandTilde), I will of course be sure to list all authors. -Lars
Mar 06 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 03:36:50 Lars T. Kyllingstad wrote:
 On Sun, 06 Mar 2011 01:21:56 -0800, Jonathan M Davis wrote:
 On Thursday 03 March 2011 08:29:00 Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
     http://kyllingen.net/code/ltk/doc/path.html
     https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
 
 Features:
 
 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].
 
 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at
 the unittests for a more complete picture.
 
 - Saner naming scheme.  (Still not set in stone, of course.)
I hate to be nitpicky, but I notice that you're the only author listed for this module. The current std.path has several authors - none of which are you. So, unless you rewrote all of the code from scratch (which you may have done), you really should put the other names on it too (though if you rewrote it thoroughly enough, they may have very little left in it that they did; unfortunately, without knowing who wrote what, you need to put all of their names on it if any of the original code is there).
Everything you see in that module is completely rewritten from scratch. I started out by trying to make changes to the original std.path, but quickly found that I had to change so much it was better to start with a clean slate. As long as the module is a part of my own library, and doesn't contain anyone else's code, I'll only put my name on it. When it gets included in Phobos, and I add the remaining functions (fcmp, fnmatch, fncharmatch and expandTilde), I will of course be sure to list all authors.
That makes sense. It's just that if you didn't rewrite it from scratch, the previous authors would need to be there, and we don't want to mess up on copyright notices, since that could conceivably cause problems at some point if we do mess them up. - Jonathan M Davis
Mar 06 2011
prev sibling next sibling parent Jim <bitcirkel yahoo.com> writes:
Lars T. Kyllingstad Wrote:

 On Sat, 05 Mar 2011 14:33:07 -0800, Jonathan M Davis wrote:
 
 On Saturday 05 March 2011 08:32:55 Lars T. Kyllingstad wrote:
 On Fri, 04 Mar 2011 08:14:44 -0500, Nick Sabalausky wrote:
 "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> wrote in message
 news:ikofkc$322$1 digitalmars.com...
 
 As mentioned in the "std.path.getName(): Screwy by design?" thread,
 I started working on a rewrite of std.path a long time ago, but I
 got sidetracked by other things.  The recent discussion got me
 working on it again, and it turned out there wasn't that much left
 to be done.
 
 So here it is, please comment:
    http://kyllingen.net/code/ltk/doc/path.html
    https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I don't want to jinx it, but there seems to be a lot of agreement in this thread. Seriously, how often does that happen around here? :)
Not too often, so I take it as a good sign that I'm onto something. ;) The only disagreement seems to be about the naming, so let's have a round of voting. Here are a few alternatives for each function. Please say which ones you prefer. * dirSeparator, dirSep, sep
dirSep and pathSep. Having Separator in the name is unnecessarily long.
  * currentDirSymbol, currentDirSym, curDirSymbol
currDirSym and parentDirSym (and currDirSymbol and parentDirSymbol if abbreviating both current and symbol is too much). Shorter but still quite clear. I would _definitely_ use two r's when abbreviating current though, since current has two r's. I confess that it's a major pet peeve of mine when I see current abbreviated with one r. It feels like it's being spelled wrong, since current has two r's.
  * basename, baseName, filename, fileName
baseName
  * dirname, dirName, directory, getDir, getDirName
dirName
  * drivename, driveName, drive, getDrive, getDriveName
driveLetter would probably be better actually - though it _could_ be more than one letter if someone has an insane number of drives (it's usually referred to as a drive letter though). Barring that, drive would be fine (as long as it's a property).
Interestingly, it seems drive names are actually restricted to one letter. See the last paragraph of this section: http://en.wikipedia.org/wiki/Drive_letter#Common_assignments -Lars
Drive names in AmigaOS are longer by default iirc. Anyway, Microsoft might someday depart from the idea of drive letters. Whether they might support longer names or just abandon drive identifiers altogether is of course insolubly unknown, but I think names are at least a more general concept than letters (so as not to lock ourselves in with a passing coherence).
Mar 06 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 16:54:41 spir wrote:
 On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose
 
  something and stick with it. Regarding all the extension naming
  stuff, I suggest you go with the "suffix" nomenclature which is more
  general and applicable to all OSs.
I agree with Lars on this one. Everyone knows what an extension is. It's a universal concept even if it's not used as much on non-Windows OSes. There _are_ plenty of programs in *nix which use it internally (likely because it's a lot easier than dealing with mime type) even if they shouldn't.
eg: numerous compilers, programming editors,... ;-)
The one that really bit me IIRC was Audacious. I had some newly ripped music files which it wouldn't play. As it turns out, the problem was that I had had to redo the settings on my ripping program shortly before, and I had forgotten to put the extension in the file name, so the newly ripped files had no extensions, and Audacious apparently used the extension to determine whether it could play a particular file. So, of course, it wouldn't play my files, since they had no extensions. Unfortunately, it took me quite a while to figure that out, and I ended up on a bit of a wild goose chase in the interim... This reminds me. I should look into mime types one of these days to see what the appropriate way (if any) would be to put support for them in Phobos. It would be nice to not have to go by extension for the few programs that I have which have to worry about file type. - Jonathan M Davis
Mar 06 2011
parent reply "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days to see 
 what the
 appropriate way (if any) would be to put support for them in Phobos. It 
 would be
 nice to not have to go by extension for the few programs that I have which 
 have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
Mar 06 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
Mar 06 2011
next sibling parent reply Christopher Nicholson-Sauls <ibisbasenji gmail.com> writes:
On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
As someone who uses hex editors quite a bit (resorting these days to using Okteta mainly), I can tell you I have yet to see any file's mime embedded at the beginning, nor have I seen it in any headers/nodes when scanning raw. Doesn't mean it's impossible of course, and certain file systems certainly might do this[1] but I haven't seen it yet[2]. You are quite right, though, that extension doesn't matter at all, except in certain corner cases. Even then, they are reasonable and predictable things -- like SO's having the right extension. Considering the posix convention of "hiding" files/directories by starting the name with a dot, it'd be hard to rely on extensions in any naive way anyhow. ;) -- Chris N-S [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or something similar to it at least. [2] Also not saying I wouldn't want to see it, necessarily. Done right, it'd be a damn nifty thing.
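To make the dotfile point concrete, here is a small sketch of how a leading dot interacts with extensions. It assumes the extension and baseName functions from the std.path in current Phobos; the helper isDotFile is a made-up name for illustration and is not part of any library.

```d
import std.path : baseName, extension;

/// Hypothetical helper: true if the last path component is a POSIX-style
/// "hidden" name, i.e. it starts with a dot and is neither "." nor "..".
bool isDotFile(string path)
{
    auto name = baseName(path);
    return name.length > 1 && name[0] == '.' && name != "..";
}

unittest
{
    // An ordinary file name: everything from the last dot is the extension.
    assert(extension("song.mp3") == ".mp3");

    // A leading dot marks a hidden file, it does not start an extension.
    assert(extension(".bashrc") == "");
    assert(extension(".config.bak") == ".bak");

    assert(isDotFile("/home/user/.bashrc"));
    assert(!isDotFile("/home/user/song.mp3"));
}
```

A naive "split on the first dot" would get both hidden-file cases wrong, which is the sort of corner case being discussed.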
Mar 06 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
As someone who uses hex editors quite a bit (resorting these days to using Okteta mainly), I can tell you I have yet to see any file's mime embedded at the beginning, nor have I seen it in any headers/nodes when scanning raw. Doesn't mean it's impossible of course, and certain file systems certainly might do this[1] but I haven't seen it yet[2]. You are quite right, though, that extension doesn't matter at all, except in certain corner cases. Even then, they are reasonable and predictable things -- like SO's having the right extension. Considering the posix convention of "hiding" files/directories by starting the name with a dot, it'd be hard to rely on extensions in any naive way anyhow. ;) -- Chris N-S [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or something similar to it at least. [2] Also not saying I wouldn't want to see it, necessarily. Done right, it'd be a damn nifty thing.
I've never studied mime types, so I don't know much about them. It's just that it was my understanding that the first few bytes in a file indicated its mime type. If that isn't the case, I have no idea how you determine the mime type of a file or what's involved in doing so. I _would_, however, like to have a way to get a file's mime type in D, so one of these days, I'll likely be looking into the matter. - Jonathan M Davis
Mar 06 2011
next sibling parent reply Johannes Pfau <spam example.com> writes:
Jonathan M Davis wrote:
On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days
 to see what the
 appropriate way (if any) would be to put support for them in
 Phobos. It would be
 nice to not have to go by extension for the few programs that I
 have which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
As someone who uses hex editors quite a bit (resorting these days to using Okteta mainly), I can tell you I have yet to see any file's mime embedded at the beginning, nor have I seen it in any headers/nodes when scanning raw. Doesn't mean it's impossible of course, and certain file systems certainly might do this[1] but I haven't seen it yet[2]. You are quite right, though, that extension doesn't matter at all, except in certain corner cases. Even then, they are reasonable and predictable things -- like SO's having the right extension. Considering the posix convention of "hiding" files/directories by starting the name with a dot, it'd be hard to rely on extensions in any naive way anyhow. ;) -- Chris N-S [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or something similar to it at least. [2] Also not saying I wouldn't want to see it, necessarily. Done right, it'd be a damn nifty thing.
I've never studied mime types, so I don't know much about them. It's just that it was my understanding that the first few bytes in a file indicated its mime type. If that isn't the case, I have no idea how you determine the mime type of a file or what's involved in doing so. I _would_, however, like to have a way to get a file's mime type in D, so one of these days, I'll likely be looking into the matter. - Jonathan M Davis
The mime type can be saved as meta data on some filesystems, but it's not in the file, it's an attribute:

-----------------------------------------------------
Storing the MIME type using Extended Attributes

An implementation MAY also get a file's MIME type from the user.mime_type extended attribute. The type given here should normally be used in preference to any guessed type, since the user is able to set it explicitly. Applications MAY choose to set the type when saving files. Since many applications and filesystems do not support extended attributes, implementations MUST NOT rely on this method being available.
-----------------------------------------------------

If this method is not available, programs look at the content of files for specific patterns to guess the mime type. It's not the mime type that is saved in the file though. Consider an mp3 file: there's no "audio/mp3" in the file, but there always is a mp3 header. If a file is scanned and a mp3 header is found, it's safe to assume the mime type. Most file formats also have some kind of magic number at the beginning, so it's easier to detect those.

More information:
http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html

-- 
Johannes Pfau
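As a rough illustration of the "magic number" approach described here, the following is a minimal sketch, not a real implementation. The function names guessMimeType and startsWithBytes are made up for this example, the signature list is a tiny hand-picked subset, and the extended-attribute lookup from the spec is left out entirely.

```d
/// Hypothetical helper: true if `data` begins with the bytes in `prefix`.
bool startsWithBytes(const(ubyte)[] data, const(ubyte)[] prefix)
{
    return data.length >= prefix.length && data[0 .. prefix.length] == prefix;
}

/// Hypothetical helper: guesses a MIME type from the leading bytes of a file.
/// Only three well-known signatures are checked; a real database has far more.
string guessMimeType(const(ubyte)[] head)
{
    static immutable ubyte[] png = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
    static immutable ubyte[] pdf = [0x25, 0x50, 0x44, 0x46, 0x2D]; // "%PDF-"
    static immutable ubyte[] gif = [0x47, 0x49, 0x46, 0x38];       // "GIF8"

    if (startsWithBytes(head, png)) return "image/png";
    if (startsWithBytes(head, pdf)) return "application/pdf";
    if (startsWithBytes(head, gif)) return "image/gif";

    // Nothing matched: fall back to a generic type instead of guessing.
    return "application/octet-stream";
}

unittest
{
    assert(guessMimeType(cast(const(ubyte)[]) "GIF89a") == "image/gif");
    assert(guessMimeType(cast(const(ubyte)[]) "%PDF-1.4") == "application/pdf");
    assert(guessMimeType(cast(const(ubyte)[]) "hello") == "application/octet-stream");
    assert(guessMimeType(null) == "application/octet-stream");
}
```

A caller would read the first handful of bytes of the file (for instance with std.stdio.File.rawRead) and pass them in. Note that the MIME string is the result of the check, not something stored in the file.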
Mar 07 2011
parent reply spir <denis.spir gmail.com> writes:
On 03/07/2011 09:19 AM, Johannes Pfau wrote:
 Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis"<jmdavisProg gmx.com>  wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days
 to see what the
 appropriate way (if any) would be to put support for them in
 Phobos. It would be
 nice to not have to go by extension for the few programs that I
 have which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
As someone who uses hex editors quite a bit (resorting these days to using Okteta mainly), I can tell you I have yet to see any file's mime embedded at the beginning, nor have I seen it in any headers/nodes when scanning raw. Doesn't mean it's impossible of course, and certain file systems certainly might do this[1] but I haven't seen it yet[2]. You are quite right, though, that extension doesn't matter at all, except in certain corner cases. Even then, they are reasonable and predictable things -- like SO's having the right extension. Considering the posix convention of "hiding" files/directories by starting the name with a dot, it'd be hard to rely on extensions in any naive way anyhow. ;) -- Chris N-S [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or something similar to it at least. [2] Also not saying I wouldn't want to see it, necessarily. Done right, it'd be a damn nifty thing.
I've never studied mime types, so I don't know much about them. It's just that it was my understanding that the first few bytes in a file indicated its mime type. If that isn't the case, I have no idea how you determine the mime type of a file or what's involved in doing so. I _would_, however, like to have a way to get a file's mime type in D, so one of these days, I'll likely be looking into the matter. - Jonathan M Davis
The mime type can be saved as meta data on some filesystems, but it's not in the file, it's an attribute: ----------------------------------------------------- Storing the MIME type using Extended Attributes An implementation MAY also get a file's MIME type from the user.mime_type extended attribute. The type given here should normally be used in preference to any guessed type, since the user is able to set it explicitly. Applications MAY choose to set the type when saving files. Since many applications and filesystems do not support extended attributes, implementations MUST NOT rely on this method being available. ----------------------------------------------------- If this method is not available, programs look at the content of files for specific patterns to guess the mime type. It's not the mime type that is saved in the file though. Consider an mp3 file: there's no "audio/mp3" in the file, but there always is a mp3 header. If a file is scanned and a mp3 header is found, it's safe to assume the mime type. Most file formats also have some kind of magic number at the beginning, so it's easier to detect those. More information: http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html
I would definitely love an inter-OS standard for storing the MIME-type in every file's first byte. Esp. the text encoding, when it's text (ask Walter why D only supports UTF's, and even then the cost in complexity just to determine which UTF (including byte-order!)). But we're not in such a world. And you can be sure that numerous (super C experts) would oppose this because of the space cost. Denis -- _________________ vita es estrany spir.wikidot.com
Mar 07 2011
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.
A better solution would be to store it in the filename. Might want more detail than one byte could allow too, so perhaps allowing three or four bytes would be a good answer. With the type in the filename, you can determine it easily from a directory listing without needing to open every individual file. This would make a big difference in listing speed on a slow filesystem and by using the name, it is compatible with all systems too.
Mar 07 2011
next sibling parent "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 07 Mar 2011 15:07:59 -0000, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.
A better solution would be to store it in the filename. Might want more detail than one byte could allow too, so perhaps allowing three or four bytes would be a good answer. With the type in the filename, you can determine it easily from a directory listing without needing to open every individual file. This would make a big difference in listing speed on a slow filesystem and by using the name, it is compatible with all systems too.
:P
Mar 07 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"Adam D. Ruppe" <destructionator gmail.com> wrote in message 
news:il2sce$11lv$1 digitalmars.com...
 spir wrote:
 I would definitely love an inter-OS standard for storing the
 MIME-type in every file's first byte.
A better solution would be to store it in the filename. Might want more detail than one byte could allow too, so perhaps allowing three or four bytes would be a good answer. With the type in the filename, you can determine it easily from a directory listing without needing to open every individual file. This would make a big difference in listing speed on a slow filesystem and by using the name, it is compatible with all systems too.
I agree, and have to say: Very well put :)
Mar 07 2011
prev sibling parent Bekenn <leaveme alone.com> writes:
On 3/7/2011 7:07 AM, Adam D. Ruppe wrote:
 A better solution would be to store it in the filename. Might
 want more detail than one byte could allow too, so perhaps allowing
 three or four bytes would be a good answer.

 With the type in the filename, you can determine it easily from
 a directory listing without needing to open every individual file.
 This would make a big difference in listing speed on a slow filesystem
 and by using the name, it is compatible with all systems too.
Along those same lines: http://blogs.msdn.com/b/oldnewthing/archive/2009/04/15/9549682.aspx
Mar 07 2011
prev sibling parent Lutger Blijdestijn <lutger.blijdestijn gmail.com> writes:
Jonathan M Davis wrote:

 On Sunday 06 March 2011 22:51:55 Christopher Nicholson-Sauls wrote:
 On 03/07/11 00:24, Jonathan M Davis wrote:
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 
 This reminds me. I should look into mime types one of these days to
 see what the
 appropriate way (if any) would be to put support for them in Phobos.
 It would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way. - Jonathan M Davis
As someone who uses hex editors quite a bit (resorting these days to using Okteta mainly), I can tell you I have yet to see any file's mime embedded at the beginning, nor have I seen it in any headers/nodes when scanning raw. Doesn't mean it's impossible of course, and certain file systems certainly might do this[1] but I haven't seen it yet[2]. You are quite right, though, that extension doesn't matter at all, except in certain corner cases. Even then, they are reasonable and predictable things -- like SO's having the right extension. Considering the posix convention of "hiding" files/directories by starting the name with a dot, it'd be hard to rely on extensions in any naive way anyhow. ;) -- Chris N-S [1] I'd just about expect the filesystem of BeOS/Haiku to do so, or something similar to it at least. [2] Also not saying I wouldn't want to see it, necessarily. Done right, it'd be a damn nifty thing.
I've never studied mime types, so I don't know much about them. It's just that it was my understanding that the first few bytes in a file indicated its mime type. If that isn't the case, I have no idea how you determine the mime type of a file or what's involved in doing so. I _would_, however, like to have a way to get a file's mime type in D, so one of these days, I'll likely be looking into the matter. - Jonathan M Davis
A good place to start is likely freedesktop.org, which maintains specifications, libraries and utilities aimed at enhancing interoperability between desktop systems. This is the page about mime types: http://freedesktop.org/wiki/Specifications/shared-mime-info-spec
Mar 07 2011
prev sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I've just reported two issues with std.path.join:
http://d.puremagic.com/issues/show_bug.cgi?id=5758
http://d.puremagic.com/issues/show_bug.cgi?id=5759

Does pathJoiner suffer from the same problems?
Mar 20 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.2639.1300648308.4748.digitalmars-d puremagic.com...
 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
Ugh, phobos has a real problem with ctfe. There's a lot that doesn't work as ctfe, but should. But worse than that, regressions with ctfe-ability seem to be extremely common.
 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?
Mar 20 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
 "Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message
 news:mailman.2639.1300648308.4748.digitalmars-d puremagic.com...
 
 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
Ugh, phobos has a real problem with ctfe. There's a lot that doesn't work as ctfe, but should. But worse than that, regressions with ctfe-ability seem to be extremely common.
Probably because CTFE is a bit of a black art with regards to what works and what doesn't. So, it's not always obvious when something will be CTFE-able or not. I expect that the only way to really solve this is to decide which functions must be CTFE-able and add unit tests which fail if they aren't. As it becomes possible to make more functions CTFE-able, they can be made CTFE-able and have the appropriate unit tests added. But as long as none of the Phobos devs are really worrying about whether functions are CTFE-able or not (and I don't get the impression that we generally are - I certainly don't think about it most of the time), they're _not_ going to notice whether the CTFE-ability of a function changes. Though honestly, enough of Phobos is in flux and CTFE is enough of a black box that I'm not sure that it's yet entirely reasonable to require that Phobos functions stay CTFE-able once they're CTFE-able. It could be that fixing a particular bug or improving the overall design of a portion of Phobos could easily result in something becoming non-CTFE-able. Regardless, in the long run (if not the short run), this issue does need to be addressed, and Phobos devs should likely be more aware of it in general. - Jonathan M Davis
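A test that fails when a function stops being CTFE-able could look roughly like this. It is a sketch only: `joinTwo` is a hypothetical stand-in for whichever library function is supposed to stay CTFE-able, not an actual Phobos or ltk.path function.

```d
/// Hypothetical helper standing in for a library function
/// that is required to remain usable at compile time.
string joinTwo(string a, string b)
{
    if (a.length == 0) return b;
    if (b.length == 0) return a;
    return a[$ - 1] == '/' ? a ~ b : a ~ "/" ~ b;
}

// static assert forces compile-time evaluation: if joinTwo ever stops
// being CTFE-able, the build itself fails instead of a run-time test.
static assert(joinTwo("foo", "bar") == "foo/bar");

// Initializing an enum also requires CTFE.
enum joined = joinTwo("foo/", "bar");
static assert(joined == "foo/bar");

// The ordinary run-time check is kept alongside it.
unittest
{
    assert(joinTwo("foo", "bar") == "foo/bar");
}

void main() {}
```

The design choice is that a CTFE regression becomes a compile error in the library's own test build, so it gets noticed even when nobody is thinking about CTFE.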
Mar 20 2011
prev sibling parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759
 
 Does pathJoiner suffer from the same problems?
Are you referring to joinPath() in my code? If so, no. It works at compile time, and correctly joins the paths. -Lars
Mar 20 2011
next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/20/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?
Are you referring to joinPath() in my code? If so, no. It works at compile time, and correctly joins the paths. -Lars
Fantastic. What about issue 5759, can it work properly so: joinPath(curdir, r"\subdir\") == r".\subdir\" Maybe that's Windows-only, dunno.
Mar 20 2011
prev sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/20/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 On 3/20/11, Lars T. Kyllingstad <public kyllingen.nospamnet> wrote:
 On Sun, 20 Mar 2011 20:11:36 +0100, Andrej Mitrovic wrote:

 I've just reported two issues with std.path.join:
 http://d.puremagic.com/issues/show_bug.cgi?id=5758
 http://d.puremagic.com/issues/show_bug.cgi?id=5759

 Does pathJoiner suffer from the same problems?
Are you referring to joinPath() in my code? If so, no. It works at compile time, and correctly joins the paths. -Lars
Fantastic. What about issue 5759, can it work properly so: joinPath(curdir, r"\subdir\") == r".\subdir\" Maybe that's Windows-only, dunno.
Sorry, I'm stupid and didn't read the entirety of your post. It does work if you said so. :)
Mar 20 2011
prev sibling parent "Nick Sabalausky" <a a.a> writes:
"Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
news:mailman.2298.1299479088.4748.digitalmars-d puremagic.com...
 On Sunday 06 March 2011 22:09:22 Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...

 This reminds me. I should look into mime types one of these days to see
 what the
 appropriate way (if any) would be to put support for them in Phobos. It
 would be
 nice to not have to go by extension for the few programs that I have
 which have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
I thought that the first few bytes of the file _were_ the mime type. Certainly, from what I've seen, extension has _no_ effect on most programs. Konqueror certainly acts like it does everything by mime type - file associations are set that way.
No, MIME is a text-based filetype-naming system that originated from SMTP and then got adopted by HTTP and various other things. It's like a really verbose file extension that isn't stored as part of the filename. These are some MIME types:

    application/json
    application/soap+xml
    application/xhtml+xml
    application/x-gzip
    image/jpeg
    text/plain
    text/xml
    video/mp4
    application/x-www-form-urlencoded

More info: http://en.wikipedia.org/wiki/Mime_type
Mar 07 2011
prev sibling parent Christopher Nicholson-Sauls <ibisbasenji gmail.com> writes:
On 03/07/11 00:09, Nick Sabalausky wrote:
 "Jonathan M Davis" <jmdavisProg gmx.com> wrote in message 
 news:mailman.2280.1299459971.4748.digitalmars-d puremagic.com...
 This reminds me. I should look into mime types one of these days to see 
 what the
 appropriate way (if any) would be to put support for them in Phobos. It 
 would be
 nice to not have to go by extension for the few programs that I have which 
 have
 to worry about file type.
I'm no unix expert, but my understanding is that mime types in the filesystem don't even exist at all, and that what it *really* does is use some complex black-box-ish algorithm that takes into account the first few bytes of the file, the extension, the exec flag, and god-knows-what-else to determine what type of file it is. Contrary to how people keep making it sound, mime type is *not* the determining factor (and cannot possibly be), but rather nothing more than the way the *result* of all that analysis is represented.
One could likely get a good grip of the "black box" by studying the source of the common "file" utility. It can be surprisingly detailed in some cases, such as the following real example:

    $ file debug.log
    debug.log: UTF-8 Unicode English text, with very long lines

It does generate mime types as well:

    $ file -bi debug.log
    text/plain; charset=utf-8

-- Chris N-S
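Until something like this exists in Phobos, a D program could simply shell out to the utility. A rough sketch, assuming a `file` binary on the PATH that accepts `-bi` as in the example above; `MimeInfo`, `parseFileOutput` and `mimeOf` are hypothetical names introduced here for illustration:

```d
import std.process : execute;
import std.string : indexOf, strip;

/// Result of parsing a line such as "text/plain; charset=utf-8".
struct MimeInfo
{
    string type;
    string charset;
}

/// Split the output of `file -bi` into type and charset.
MimeInfo parseFileOutput(string line)
{
    line = line.strip;
    immutable semi = line.indexOf(';');
    if (semi < 0)
        return MimeInfo(line, null);

    enum key = "charset=";
    string rest = line[semi + 1 .. $].strip;
    string charset = (rest.length >= key.length && rest[0 .. key.length] == key)
        ? rest[key.length .. $]
        : null;
    return MimeInfo(line[0 .. semi].strip, charset);
}

/// Ask the external `file` utility about `path`.
/// Returns MimeInfo.init if the command reports failure.
MimeInfo mimeOf(string path)
{
    auto result = execute(["file", "-bi", path]);
    if (result.status != 0)
        return MimeInfo.init;
    return parseFileOutput(result.output);
}

void main()
{
    auto info = parseFileOutput("text/plain; charset=utf-8\n");
    assert(info.type == "text/plain");
    assert(info.charset == "utf-8");
}
```

Only the parsing step is exercised in `main`, since the result of `mimeOf` depends on the installed utility and its magic database.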
Mar 06 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 03/07/2011 02:06 AM, Jonathan M Davis wrote:
 On Sunday 06 March 2011 16:54:41 spir wrote:
 On 03/07/2011 01:44 AM, Jonathan M Davis wrote:
 I think whatever you choose will not please everybody, so just choose

   something and stick with it. Regarding all the extension naming
   stuff, I suggest you go with the "suffix" nomenclature which is more
   general and applicable to all OSs.
I agree with Lars on this one. Everyone knows what an extension is. It's a universal concept even if it's not used as much on non-Windows OSes. There _are_ plenty of programs in *nix which use it internally (likely because it's a lot easier than dealing with mime type) even if they shouldn't.
eg: numerous compilers, programming editors,... ;-)
The one that really bit me IIRC was Audacious. I had some newly ripped music files which it wouldn't play. As it turns out, the problem was that I had had to redo the settings on my ripping program shortly before, and I had forgotten to put the extension in the file name, so the newly ripped files had no extensions, and Audacious apparently used the extension to determine whether it could play a particular file. So, of course, it wouldn't play my files, since they had no extensions. Unfortunately, it took me quite a while to figure that out, and I ended up on a bit of a wild goose chase in the interim... This reminds me. I should look into mime types one of these days to see what the appropriate way (if any) would be to put support for them in Phobos. It would be nice to not have to go by extension for the few programs that I have which have to worry about file type.
I'd say: MIME types are another wild goose chase field ;-) Denis -- _________________ vita es estrany spir.wikidot.com
Mar 06 2011
prev sibling next sibling parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on it
 again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d

 Features:

 - Most functions work with all string types, i.e. all permutations of
 mutable/const/immutable(char/wchar/dchar)[].  Notable exceptions are
 toAbsolute() and toCanonical, because they rely on std.file.getcwd()
 which returns an immutable(char)[].

 - Correct behaviour in corner cases that aren't covered by the current
 std.path.  See the other thread for some examples, or take a look at the
 unittests for a more complete picture.

 - Saner naming scheme.  (Still not set in stone, of course.)

 -Lars
I hope I'm not too late for the party, especially because I do have a bit of criticism for this one...

Looking at the DDoc page, this module seems to have very platform-dependent behavior. I find this detrimental, even unsavory. I think it's best that programs work with internal data structures that are as platform-independent as possible, and only convert to platform-dependent data or API at the very last possible moment, when so required (ie, when interfacing with the actual OS, or with the user).

So, with that in mind, there is a toCanonical function that converts to a OS specific format, but there's no function to convert to an OS/platform independent format?... :S

Also, what does dirName( "d:file") return on POSIX? Is it the same as on Windows? I hope so, and that such behavior is explicitly part of the API and not just accidental. (I don't have a Linux machine nearby to try it out myself) Because, what if I want to refer to Windows paths from a POSIX application? (I'm sure there are scenarios where that makes sense)

Or what if I just want my application to behave in a pedantically platform-identical way, like having it accept backslashes as path separators not just on Windows but on POSIX as well? (This makes much more sense than is immediately obvious... in many cases it can be argued to be the Right Thing)

I'm sorry if I seem a bit agitated :P , it's just that due to some more or less recent traumatizing events (a long story relating to Windows 7) I have become a Crusader for cross-platformness.

The other suggestion I have (mentioned by others as well) is to generalize the drive letter to a device symbol/string/identifier. But this only makes sense if this device segment works in a platform-independent way. This generalization might make the path module useful in a few new contexts. Note, I'm not saying it should handle URIs, in fact I want to explicitly say it should not handle URIs, as URIs have additional semantics (query and fragment parts, the percent encoding, etc.) which should not be of concern here.

BTW, I admit I take some inspiration from this API: http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
Note that here there is only *one* platform dependent function, the aptly named toOSString() ...

-- Bruno Medeiros - Software Engineer
Apr 06 2011
parent reply "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:

 On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.

 So here it is, please comment:

      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I hope I'm not too late for the party, especially because I do have a bit of criticism for this one...
Not at all. Reviews of, and further work on, std.path have been put on hold until I have handed in my PhD thesis (which, if all goes well, should be very soon). I haven't got time to participate in any extensive discussions on the NG right now. So there will be ample opportunity to comment on the design yet. :)
 Looking at the DDoc page, this module seems to have very
 platform-dependent behavior. I find this detrimental, even unsavory. I
 think it's best that programs work with internal data structures that
 are as platform-independent as possible, and only convert to
 platform-dependent data or API at the very last possible moment, when so
 required (ie, when interfacing with the actual OS, or with the user).
 
 So, with that in mind, there is a toCanonical function that converts to
 a OS specific format, but there's no function to convert to an
 OS/platform independent format?... :S
 
 Also, what does dirName( "d:file") return on POSIX? Is it the same as on
 Windows? I hope so, and that such behavior is explicitly part of the API
 and not just accidental. (I don't have a Linux machine nearby to try it out
 myself) Because, what if I want to refer to Windows paths from a POSIX
 application? (I'm sure there are scenarios where that makes sense)
 
 Or what if I just want my application to behave in a pedantically
 platform-identical way, like having it accept backslashes as path
 separators not just on Windows but on POSIX as well? (This makes much
 more sense than is immediately obvious... in many cases it can be argued
 to be the Right Thing)
 
 
 I'm sorry if I seem a bit agitated :P , it's just that due to some more
 or less recent traumatizing events (a long story relating to Windows 7)
 I have become a Crusader for cross-platformness.
 
 
 The other suggestion I have (mentioned by others as well) is to
 generalize the drive letter to a device symbol/string/identifier. But
 this only makes sense if this device segment works in a
 platform-independent way. This generalization might make the path module
 useful in a few new contexts. Note, I'm not saying it should handle
 URIs, in fact I want to explicitly say it should not handle URIs, as
 URIs have additional semantics (query and fragment parts, the percent
 encoding, etc.) which should not be of concern here.
 
 BTW, I admit I take some inspiration from this API:
 http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
 Note that here there is only *one* platform dependent function, the
 aptly named toOSString() ...
Thanks for the feedback, I will read it more thoroughly when I take up work on std.path again. Just a general comment, though: Having the exact same functionality on Windows and POSIX just doesn't work, if nothing else simply because "c:\dir\file" is a valid base name on POSIX. That is, both ':' and '\' are valid filename characters. The ONLY invalid filename characters on POSIX are '/' and '\0'. Yes, weird file names like that may be uncommon, but the library should be able to handle them nonetheless. -Lars
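The point about "c:\dir\file" can be made concrete with a small D sketch. All three functions are hypothetical helpers written for this illustration (they are not the ltk.path or std.path implementations), and both base-name functions are deliberately simplified: trailing separators and UNC paths are not handled.

```d
/// True if `name` is acceptable as a single POSIX filename component:
/// non-empty, and free of '/' and the null byte.
bool isValidPosixFilename(const(char)[] name)
{
    if (name.length == 0)
        return false;
    foreach (char c; name)
        if (c == '/' || c == '\0')
            return false;
    return true;
}

/// Last path component under POSIX rules: only '/' separates.
string posixBaseName(string path)
{
    size_t start = 0;
    foreach (i, char c; path)
        if (c == '/')
            start = i + 1;
    return path[start .. $];
}

/// Last path component under (simplified) Windows rules:
/// '/', '\' and a drive colon at index 1 all act as separators.
string windowsBaseName(string path)
{
    size_t start = 0;
    foreach (i, char c; path)
        if (c == '/' || c == '\\' || (c == ':' && i == 1))
            start = i + 1;
    return path[start .. $];
}

void main()
{
    enum name = `c:\dir\file`;
    assert(isValidPosixFilename(name));       // one legal POSIX filename
    assert(posixBaseName(name) == name);      // POSIX sees a single component
    assert(windowsBaseName(name) == "file");  // Windows sees drive, dir, file
    assert(!isValidPosixFilename("a/b"));
}
```

The same string therefore has two different correct answers, which is why a single platform-independent interpretation of a path string cannot satisfy both systems.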
Apr 07 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
 On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:
 On 03/03/2011 16:29, Lars T. Kyllingstad wrote:
 As mentioned in the "std.path.getName(): Screwy by design?" thread, I
 started working on a rewrite of std.path a long time ago, but I got
 sidetracked by other things.  The recent discussion got me working on
 it again, and it turned out there wasn't that much left to be done.
 
 So here it is, please comment:
      http://kyllingen.net/code/ltk/doc/path.html
      https://github.com/kyllingstad/ltk/blob/master/ltk/path.d
I hope I'm not too late for the party, especially because I do have a bit of criticism for this one...
Not at all. Reviews of, and further work on, std.path have been put on hold until I have handed in my PhD thesis (which, if all goes well, should be very soon). I haven't got time to participate in any extensive discussions on the NG right now. So there will be ample opportunity to comment on the design yet. :)
 Looking at the DDoc page, this module seems to have very
 platform-dependent behavior. I find this detrimental, even unsavory. I
 think it's best that programs work with internal data structures that
 are as platform-independent as possible, and only convert to
 platform-dependent data or API at the very last possible moment, when so
 required (ie, when interfacing with the actual OS, or with the user).
 
 So, with that in mind, there is a toCanonical function that converts to
 a OS specific format, but there's no function to convert to an
 OS/platform independent format?... :S
 
 Also, what does dirName( "d:file") return on POSIX? Is it the same as on
 Windows? I hope so, and that such behavior is explicitly part of the API
 and not just accidental. (I don't have a Linux machine nearby to try it out
 myself) Because, what if I want to refer to Windows paths from a POSIX
 application? (I'm sure there are scenarios where that makes sense)
 
 Or what if I just want my application to behave in a pedantically
 platform-identical way, like having it accept backslashes as path
 separators not just on Windows but on POSIX as well? (This makes much
 more sense than is immediately obvious... in many cases it can be argued
 to be the Right Thing)
 
 
 I'm sorry if I seem a bit agitated :P , it's just that due to some more
 or less recent traumatizing events (a long story relating to Windows 7)
 I have become a Crusader for cross-platformness.
 
 
 The other suggestion I have (mentioned by others as well) is to
 generalize the drive letter to a device symbol/string/identifier. But
 this only makes sense if this device segment works in a
 platform-independent way. This generalization might make the path module
 useful in a few new contexts. Note, I'm not saying it should handle
 URIs, in fact I want to explicitly say it should not handle URIs, as
 URIs have additional semantics (query and fragment parts, the percent
 encoding, etc.) which should not be of concern here.
 
 BTW, I admit I take some inspiration from this API:
 http://help.eclipse.org/helios/index.jsp?topic=/org.eclipse.platform.doc.isv/reference/api/org/eclipse/core/runtime/IPath.html
 Note that here there is only *one* platform dependent function, the
 aptly named toOSString() ...
Thanks for the feedback, I will read it more thoroughly when I take up work on std.path again. Just a general comment, though: Having the exact same functionality on Windows and POSIX just doesn't work, if nothing else simply because "c:\dir\file" is a valid base name on POSIX. That is, both ':' and '\' are valid filename characters. The ONLY invalid filename characters on POSIX are '/' and '\0'. Yes, weird file names like that may be uncommon, but the library should be able to handle them nonetheless.
And on some file systems, even / is valid! Though it's not worth it to try and get std.path to work with files with / in the name. It's generally a very bad idea to create a file with a / in the name - too many programs would choke on it or just plain have the wrong behavior. However, there _are_ *nix file systems which allow for / in file names. - Jonathan M Davis
Apr 07 2011
parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Thu, 07 Apr 2011 03:57:18 -0700, Jonathan M Davis wrote:
 
 And on some file systems, even / is valid! Though it's not worth it to
 try and get std.path to work with files with / in the name. It's
 generally a very bad idea to create a file with a / in the name - too
 many programs would choke on it or just plain have the wrong behavior.
 However, there _are_ *nix file systems which allow for / in file names.
Which filesystems are those? The POSIX:2008 specification specifically states that "The characters composing the name may be selected from the set of all character values excluding the <slash> character and the null byte." where <slash> is defined as '/'. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html -Lars
Apr 07 2011
prev sibling parent Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
On 07/04/2011 09:32, Lars T. Kyllingstad wrote:
 On Wed, 06 Apr 2011 15:51:15 +0100, Bruno Medeiros wrote:

 Thanks for the feedback, I will read it more thoroughly when I take up
 work on std.path again.  Just a general comment, though:  Having the
 exact same functionality on Windows and POSIX just doesn't work, if
 nothing else simply because "c:\dir\file" is a valid base name on POSIX.
 That is, both ':' and '\' are valid filename characters.  The ONLY
 invalid filename characters on POSIX are '/' and '\0'.

 Yes, weird file names like that may be uncommon, but the library should
 be able to handle them nonetheless.

 -Lars
Yeah, that's a good point. I'm not sure yet if there is a good way that could address both issues, I want to think about it more later. (in Eclipse's IPath this is less of a problem because that API works with a path data type, not with a path string directly) -- Bruno Medeiros - Software Engineer
Apr 13 2011
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-04-07 04:38, Lars T. Kyllingstad wrote:
 On Thu, 07 Apr 2011 03:57:18 -0700, Jonathan M Davis wrote:
 And on some file systems, even / is valid! Though it's not worth it to
 try and get std.path to work with files with / in the name. It's
 generally a very bad idea to create a file with a / in the name - too
 many programs would choke on it or just plain have the wrong behavior.
 However, there _are_ *nix file systems which allow for / in file names.
Which filesystems are those?  The POSIX:2008 specification specifically states that     "The characters composing the name may be selected from      the set of all character values excluding the <slash>      character and the null byte." where <slash> is defined as '/'. http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
I didn't know that Posix had anything to say on the matter (though it doesn't hurt my feelings any that it effectively says that / isn't valid in file names). However, the file systems themselves apparently don't necessarily stick to that. If you take a look at http://en.wikipedia.org/wiki/Comparison_of_file_systems you can see which file systems allow which characters. For instance, the exts disallow NUL and /. However ReiserFS, Btrfs, JFS, and XFS allow /. In fact, most of the Linux file systems seem to allow / (though the exts are probably the most used and they don't). Still, Posix or no, I would expect that using / in a file name would be just asking for trouble and find no reason to support it in std.path (particularly when we'd rely on the underlying C calls handling it appropriately, and I expect that there's a good chance that they don't). But if Posix disallows it, then we definitely shouldn't. Still, the file systems themselves aren't necessarily Posix-related, and apparently quite a few of the *nix file systems allow /. - Jonathan M Davis
Apr 07 2011