digitalmars.D.learn - std.file.dirEntries unsorted

reply Timothee Cour <thelastmammoth gmail.com> writes:
dirEntries depends on readdir, whose order is unspecified (e.g.
http://stackoverflow.com/questions/8977441/does-readdir-guarantee-an-order;
I've also seen dirEntries return entries in non-alphabetical order).

Shouldn't we make dirEntries return entries in alphabetical order by default,
with an option to return them in unspecified native order for efficiency?

At the very least, the docs should state that the order is undefined.
Dec 10 2013
parent reply "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Wednesday, 11 December 2013 at 02:11:51 UTC, Timothee Cour 
wrote:
 dirEntries depends on readdir, which has undefined order (eg:
 http://stackoverflow.com/questions/8977441/does-readdir-guarantee-an-order,
 and I've experienced as well dirEntries in non-alphabetical 
 order)

 shouldn't we make dirEntries return in alphabetical order by 
 default, with
 an option to return in unspecified native order for efficiency?

 at least, it should be specified in doc that order is undefined.
It should only be documented. In my experience, processing files doesn't need a particular order, and when sorting is needed it may not be by name.

Returning a sorted directory is difficult to define: should directories come first or be mixed with the files? Is uppercase grouped together? Does A come before a? Should the extension be included or postponed for later?

I think sorting should be explicit.
Dec 10 2013
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Wed, 11 Dec 2013 08:00:27 +0100
schrieb "Jesse Phillips" <Jesse.K.Phillips+D gmail.com>:

 It should only be documented. In my experience processing files
 don't need a particular order and sorting may not be needed by
 name.

 Returning a sorted directory is difficult to define, should
 directories come first or be mixed with the files. Is uppercase
 grouped together? Does A come before a. Should the extension be
 included or postponed for later.

 I think sorting should be explicit.
Does 2.jpg come after 10.jpg? What's the order of Arabic-Indic "one" ۱ compared to "Latin one" 1? And so on and so forth.

-- 
Marco
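
To make the ambiguity concrete, here is a minimal sketch of how two plausible "alphabetical" orders disagree on the same names. The helper names sortedDefault and sortedIgnoringCase are made up for illustration; only sort (std.algorithm) and icmp (std.uni) come from Phobos.

```d
import std.algorithm.sorting : sort;
import std.stdio : writeln;
import std.uni : icmp;

// Hypothetical helper: plain `a < b` ordering, which compares code units.
string[] sortedDefault(string[] names)
{
    auto result = names.dup;
    sort(result);
    return result;
}

// Hypothetical helper: case-insensitive ordering via std.uni.icmp.
string[] sortedIgnoringCase(string[] names)
{
    auto result = names.dup;
    sort!((a, b) => icmp(a, b) < 0)(result);
    return result;
}

void main()
{
    auto names = ["2.jpg", "10.jpg", "B.txt", "a.txt"];

    // '1' < '2', so "10.jpg" precedes "2.jpg"; 'B' (0x42) precedes 'a' (0x61).
    writeln(sortedDefault(names));      // ["10.jpg", "2.jpg", "B.txt", "a.txt"]

    // Ignoring case moves "a.txt" ahead of "B.txt"; the numbers are still
    // compared character by character, not by numeric value.
    writeln(sortedIgnoringCase(names)); // ["10.jpg", "2.jpg", "a.txt", "B.txt"]
}
```

Neither order puts 2.jpg before 10.jpg; that would need yet another comparison that parses the digits.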
Dec 11 2013
next sibling parent reply Timothee Cour <thelastmammoth gmail.com> writes:
yes, I agree sorting should be explicit as there's no natural order.
However sorting after calling dirEntries is not great as typically one
wants to sort within a given directory level and it's too late to sort once
all the directory levels are flattened.
so how about having an extra argument that takes a lambda (eg
binaryFun!"a<b") in dirEntries, or, having an additional function in
std.file that takes such lambda.


On Wed, Dec 11, 2013 at 2:31 AM, Marco Leise <Marco.Leise gmx.de> wrote:

 Am Wed, 11 Dec 2013 08:00:27 +0100
 schrieb "Jesse Phillips" <Jesse.K.Phillips+D gmail.com>:

 It should only be documented. In my experience processing files
 don't need a particular order and sorting may not be needed by
 name.

 Returning a sorted directory is difficult to define, should
 directories come first or be mixed with the files. Is uppercase
 grouped together? Does A come before a. Should the extension be
 included or postponed for later.

 I think sorting should be explicit.
Does 2.jpg come after 10.jpg? What's the order of Arabic-Indic "one" ۱ compared to "Latin one" 1? And so on and so forth. -- Marco
Dec 11 2013
parent "Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Wednesday, 11 December 2013 at 18:34:54 UTC, Timothee Cour 
wrote:
 yes, I agree sorting should be explicit as there's no natural 
 order.
 However sorting after calling dirEntries is not great as 
 typically one
 wants to sort within a given directory level and it's too late 
 to sort once
 all the directory levels are flattened.
 so how about having an extra argument that takes a lambda (eg
 binaryFun!"a<b") in dirEntries, or, having an additional 
 function in
 std.file that takes such lambda.
Why is it too late? The file name includes the full path, so sorting will still sort sibling directories separately.

    foreach (de; dirEntries(".", SpanMode.depth).array.sort!((a, b) => a.name < b.name))
        ...

This seems reasonable for your need, but I didn't test it to check the behavior.

dirEntries isn't random access, so we can't sort it directly. I don't think placing the sort in dirEntries saves much, and it would hide the required array allocation.
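
A self-contained version of that suggestion might look like the sketch below. The function name sortedEntries is hypothetical; the Phobos calls (dirEntries, SpanMode.depth, array, sort) are the ones named in the post.

```d
import std.algorithm.sorting : sort;
import std.array : array;
import std.file : dirEntries, DirEntry, SpanMode;
import std.stdio : writeln;

// Hypothetical helper: collects every entry under `root` and sorts the
// result by full path name.
DirEntry[] sortedEntries(string root)
{
    // dirEntries is a lazy input range, so it has to be copied into an
    // array before it can be sorted; the allocation is visible here.
    auto entries = dirEntries(root, SpanMode.depth).array;
    entries.sort!((a, b) => a.name < b.name);
    return entries;
}

void main()
{
    foreach (de; sortedEntries("."))
        writeln(de.name);
}
```

Because the sort happens after the whole tree has been collected, the traversal mode only affects the order in which entries are gathered, not the final order.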
Dec 11 2013
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, December 11, 2013 10:34:29 Timothee Cour wrote:
 yes, I agree sorting should be explicit as there's no natural order.
 However sorting after calling dirEntries is not great as typically one
 wants to sort within a given directory level and it's too late to sort once
 all the directory levels are flattened.
 so how about having an extra argument that takes a lambda (eg
 binaryFun!"a<b") in dirEntries, or, having an additional function in
 std.file that takes such lambda.
You can use SpanMode.shallow if you just want to look at one directory at a time - or, as Jesse points out, you could sort on the entire path.

Regardless, dirEntries can't sort for you, because in order to do that it would have to allocate a container of some kind (be it an array or something else) to hold all of the entries and then sort them, whereas dirEntries is lazy and doesn't hold anything other than the info on where it currently is in the list of directories. The actual file list is held by the OS.

So, you might as well just implement what you want on top of dirEntries. What you're asking for is already essentially a wrapper around dirEntries - one that would have to allocate on the heap, no less - so it really makes more sense for it to be done outside of dirEntries.

- Jonathan M Davis
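
One way such a wrapper could look is sketched below. It is only an illustration: walkSorted is a hypothetical name, not something in std.file, and the default comparison by name is an assumption. It lists one level at a time with SpanMode.shallow, sorts that level with a caller-supplied predicate, and then descends.

```d
import std.algorithm.sorting : sort;
import std.array : array;
import std.file : dirEntries, DirEntry, SpanMode;
import std.stdio : writeln;

// Hypothetical wrapper: visits `root` recursively, sorting the entries of
// each directory with `less` before descending into subdirectories.
void walkSorted(alias less = (a, b) => a.name < b.name)
               (string root, void delegate(DirEntry) visit)
{
    // One directory level only; this is the heap allocation that
    // dirEntries itself avoids by being lazy.
    auto entries = dirEntries(root, SpanMode.shallow).array;
    entries.sort!less;

    foreach (entry; entries)
    {
        visit(entry);
        // isDir follows symlinks, so a symlink cycle would recurse forever;
        // a real implementation would need to guard against that.
        if (entry.isDir)
            walkSorted!less(entry.name, visit);
    }
}

void main()
{
    walkSorted(".", (DirEntry e) { writeln(e.name); });
}
```

Only one directory's entries are held in memory per recursion level, and the caller decides what "sorted" means, for example `walkSorted!((a, b) => a.size < b.size)(".", ...)`. Note that this order can differ from sorting the flattened full paths: with directories a and a-b, the path a-b/x compares less than a/x as a string (because '-' sorts before '/'), while the per-level walk finishes everything under a first.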
Dec 11 2013