
digitalmars.D - Regarding implementing a stable sort for Phobos

reply "Xinok" <xinok live.com> writes:
I've been playing with sorting algorithms a lot in recent months, 
so I want to implement a *working* stable sort for Phobos, since 
the current one is broken. I have a working library and I'm still 
adding to it. It's much more complex than a simple merge sort, 
being over 300 lines of code at the moment.

- It's a natural merge sort, which is faster on partially sorted 
lists, and adds little overhead for mostly random lists.
- It uses O(log n · log n), i.e. O((log n)²), additional space 
for merging.
- I wrote it to sort random-access ranges *without* slicing, but 
I think the exclusion of slicing makes it slower. I'm writing a 
separate implementation which uses slicing and I'll keep it if 
it's much faster.
- To avoid multiple allocations, the user can allocate their own 
temporary memory and pass it to the sort function.
- I decided against using pointers. While I could use them to 
write highly optimized code for arrays, pointers can't be used in 
safe code and don't work very well in CTFE at the moment.
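
For readers unfamiliar with the approach: a natural merge sort discovers runs that are already sorted and merges neighbouring runs, instead of splitting blindly. A minimal sketch of the two building blocks (illustrative code, not the actual library):

```d
// Length of the non-descending run starting at index i; these runs are
// what a natural merge sort discovers instead of splitting blindly.
size_t runLength(T)(const T[] r, size_t i)
{
    size_t j = i + 1;
    while (j < r.length && !(r[j] < r[j - 1])) ++j;
    return j - i;
}

// Stable merge of the adjacent sorted runs r[0 .. mid] and r[mid .. $],
// using a caller-supplied buffer of at least mid elements (this is where
// the user-provided temporary memory comes in).
void mergeRuns(T)(T[] r, size_t mid, T[] tmp)
{
    tmp[0 .. mid] = r[0 .. mid];
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < r.length)
        r[k++] = (r[j] < tmp[i]) ? r[j++] : tmp[i++]; // ties take the left: stable
    while (i < mid)
        r[k++] = tmp[i++]; // leftovers on the right are already in place
}
```

A full natural merge sort would repeatedly find runs with runLength and merge neighbours with mergeRuns until a single run covers the whole range.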

Is it worth keeping the implementation *without* slicing? Many 
functions in Phobos do require slicing, including the unstable 
sort, and I think most random-access ranges do or could support 
slicing.

What would you prefer in terms of memory usage vs performance? 
O(n/2) space is optimal for performance, O(1) (in-place) requires 
zero allocations but is slower, and O(log n log n) provides a 
good balance.

Should I implement concurrency? Merge sort is very easy to 
parallelize, and being in-place or natural doesn't change this 
fact.

Should we take a different path and go for a basic merge sort or 
even Timsort? I've considered writing a Timsort, though that seems 
like a daunting task to me, so here's an implementation written in 
C - https://github.com/swenson/sort
Mar 12 2012
next sibling parent reply Chad J <chadjoan __spam.is.bad__gmail.com> writes:
On 03/13/2012 02:31 AM, Xinok wrote:
 I've been playing with sorting algorithms a lot in recent months, so I
 want to implement a *working* stable sort for Phobos which is broken at
 the moment. I have a working library and I'm still adding to it. It's
 much more complex than a simple merge sort, being over 300 lines of code
 at the moment.

 ...

Hey, I'd love to see more sorting algorithms in phobos. Being stuck with one seems kind of... wrong.

If the range is a linked list, shouldn't it be possible to do a merge sort optimally in-place, using no extra memory whatsoever? I know it can be done in-place, but I've never benchmarked it. I wonder if it's worth considering, and how it would compare against array-based merge sort with allocations and such.

Although it's probably out of your scope right now, I'd like to see insertion sort some day. I would use it for things like broadphase sorting in collision detection (that is, you sort everything by, say, x coordinates first, and then you can walk through the whole simulation from left to right and have very few things to check collisions for at each point). Since the ordering of the objects in the simulation is unlikely to change much between frames, it will be almost entirely sorted each time. I have to imagine insertion sort would be awesome at that; nearly O(n). Maybe if it hits more than log(n) nonlocal insertions it would bail out into a merge sort or something.
Mar 12 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 13/03/2012 07:53, Chad J wrote:
 On 03/13/2012 02:31 AM, Xinok wrote:
 I've been playing with sorting algorithms a lot in recent months, so I
 want to implement a *working* stable sort for Phobos which is broken at
 the moment. I have a working library and I'm still adding to it. It's
 much more complex than a simple merge sort, being over 300 lines of code
 at the moment.

 ...

Hey, I'd love to see more sorting algorithms in phobos. Being stuck with one seems kind of... wrong.

I have a radix sort (that needs some rework to be Phobos quality) and a smoothsort (that could be included in Phobos).
 If the range is a linked list, shouldn't it be possible to do a merge
 sort with optimally in-place and use no extra memory whatsoever? I know
 it can be done in-place, but I've never benchmarked it. I wonder if it's
 worth considering, and how it would compare against array-based merge
 sort with allocations and such.

I have a sort for ForwardRange, but it is O(n²) and unstable. However, it is in place. I don't think we should allocate behind one's back, so merge sort should be an option only when requested explicitly.
 Although it's probably out of your scope right now, I'd like to see
 insertion sort some day. I would use it for things like broadphase
 sorting in collision detection (that is, you sort everything by say, x
 coordinates first, and then you can walk through the whole simulation
 from left-to-right and have very few things to check collisions for at
 each point). Since the ordering of the objects in the simulation is
 unlikely to change much between frames, it will be almost entirely
 sorted each time. I have to imagine insertion sort would be awesome at
 that; nearly O(n). Maybe if it hits more than log(n) nonlocal insertions
 it would bail out into a merge sort or something.

Smoothsort is a good solution for that. Radix sort is also guaranteed to be O(n). Insertion sort is quite risky, because it can end up at O(n²) very easily.
Mar 13 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 13/03/2012 10:19, Xinok wrote:
 On Tuesday, 13 March 2012 at 08:37:06 UTC, deadalnix wrote:
 I have a radix sort (that need some rework to be phobos quality) and a
 smoothsort (that could be included in phobos).

Would you mind sharing your smoothsort? I haven't implemented one myself and I'd love to test it out.

It is on github : https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d
 Radix sort, on the other hand, is not a comparison sort. You'd have to
 rewrite it for every possible element and container type.

You can do quite a lot with a bijective transformation.
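
One classic example of such a bijection (a standard trick, not necessarily the one used in Dsort): flipping the sign bit maps signed integers to unsigned integers while preserving order, so an unsigned byte-wise radix sort handles signed values correctly.

```d
// Order-preserving bijection from int to uint: flip the sign bit.
// After this mapping an unsigned radix sort orders signed values
// correctly; apply the inverse (same XOR) when writing results back.
uint radixKey(int x)
{
    return cast(uint) x ^ 0x8000_0000u;
}

static assert(radixKey(int.min) == uint.min); // most negative maps lowest
static assert(radixKey(int.max) == uint.max); // most positive maps highest
static assert(radixKey(-1) < radixKey(0));    // order is preserved
```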
Mar 13 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 13/03/2012 16:08, Xinok wrote:
 On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:
 On 13/03/2012 10:19, Xinok wrote:
 Would you mind sharing your smoothsort? I haven't implemented one myself
 and I'd love to test it out.

It is on github : https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d

Thanks. I found a couple cases where it performs better, but overall, the overhead of the algorithm seems to be too much and most other algorithms performed better. While some need to be rewritten, I have a slew of algorithms if you want them for your project.

Smoothsort is intended for semi-sorted data (like transparent polygons in a 3D scene), and is ideal for keeping data sorted. It also has a guaranteed O(n log n) running time. But a quicksort variant (like the one we have in Phobos) is faster in the general case.
Mar 13 2012
parent deadalnix <deadalnix gmail.com> writes:
On 13/03/2012 17:38, Xinok wrote:
 On Tuesday, 13 March 2012 at 16:04:55 UTC, deadalnix wrote:
 On 13/03/2012 16:08, Xinok wrote:
 On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:
 On 13/03/2012 10:19, Xinok wrote:
 Would you mind sharing your smoothsort? I haven't implemented one
 myself
 and I'd love to test it out.

It is on github : https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d

Thanks. I found a couple cases where it performs better, but overall, the overhead of the algorithm seems to be too much and most other algorithms performed better.

Smoothsort is intended for semi-sorted data (like transparent polygons in a 3D scene), and is ideal for keeping data sorted. It also has a guaranteed O(n log n) running time. But a quicksort variant (like the one we have in Phobos) is faster in the general case.

It only performs well if there aren't many elements to move around. For example, I took a sorted list with 1 million elements, and appended 64 random elements. Smoothsort was the second slowest, only marginally beating heap sort. My natural merge sort was 27x faster.

Yes. That being said, my implementation uses multiple swaps where moves would have been more appropriate, and doesn't implement some improvements. Merge sort is also known to be very fast (it's the default in some languages) but triggers memory allocation, which some users cannot afford. Definitely, this is something we should have in Phobos.
Mar 13 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/13/12 4:02 AM, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos. Being stuck
 with one seems kind of... wrong.

Things like this are better left to 3rd party libs. Phobos already has two, a stable and unstable sort, which fulfill 99% of cases.

I think we need a good sort for ranges of ranges (e.g. array of string). Right now sort() does pretty badly on arrays of strings with large common prefixes.

Andrei
Mar 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/13/12 10:54 AM, Sean Kelly wrote:
 How does the built-in sort do?  I ask because the sort routine I
 wrote works the same way, which is optimized for ranges with a lot of
 common elements.

It's not about common (equal) elements, it's about elements for which comparisons do a lot of work because they have common prefixes. Consider:

auto arr = [ "aaa", "aab", "aac", "aad" ];
sort!((a, b) => a > b)(arr);

There will be a lot of redundant prefix comparisons because the sorting method doesn't have information about the common prefixes. Trie-based sorting is a more efficient method for ranges of ranges, see e.g. http://en.wikipedia.org/wiki/Burstsort.

Andrei
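
The cost is easy to make visible with an instrumented predicate that counts how many characters each comparison examines (a hypothetical demo, not Phobos code):

```d
import std.algorithm : sort;

// Sort strings while counting the characters the comparisons examine.
// The counting predicate is purely illustrative instrumentation.
size_t sortCountingChars(string[] arr)
{
    size_t charsExamined;
    arr.sort!((a, b) {
        size_t n;
        while (n < a.length && n < b.length && a[n] == b[n]) ++n;
        charsExamined += n + 1; // this comparison rescanned the shared prefix
        return a < b;
    });
    return charsExamined;
}
```

On [ "commonprefix_d", "commonprefix_b", "commonprefix_c", "commonprefix_a" ], every one of the O(n log n) comparisons walks the 13-character prefix again, whereas a trie/burstsort touches each prefix character only once.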
Mar 13 2012
parent bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 it's about elements for which 
 comparisons do a lot of work because they have common prefixes. Consider:
 
 auto arr = [ "aaa", "aab", "aac", "aad" ];
 sort!((a, b) => a > b)(arr);
 
 There will be a lot of redundant prefix comparisons because the sorting 
 method doesn't have information about the common prefixes.

As a benchmark for this new sorting algorithm I suggest using a poor man's BWT (http://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform). The purpose here is not to create the most efficient BWT.

Bye,
bearophile
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos.  Being 
 stuck with one seems kind of... wrong.

Things like this are better left to 3rd party libs. Phobos already has two, a stable and unstable sort, which fulfill 99% of cases.
 If the range is a linked list, shouldn't it be possible to do a 
 merge sort with optimally in-place and use no extra memory 
 whatsoever?  I know it can be done in-place, but I've never 
 benchmarked it.  I wonder if it's worth considering, and how it 
 would compare against array-based merge sort with allocations 
 and such.

Yes, it's possible because insertions are inexpensive in linked lists. However, it would be impractical to implement one at the moment because Phobos has no concept of linked lists (ranges wouldn't cover it).
 Although it's probably out of your scope right now, I'd like to 
 see insertion sort some day.  I would use it for things like 
 broadphase sorting in collision detection (that is, you sort 
 everything by say, x coordinates first, and then you can walk 
 through the whole simulation from left-to-right and have very 
 few things to check collisions for at each point).  Since the 
 ordering of the objects in the simulation is unlikely to change 
 much between frames, it will be almost entirely sorted each 
 time.  I have to imagine insertion sort would be awesome at 
 that; nearly O(n).  Maybe if it hits more than log(n) nonlocal 
 insertions it would bail out into a merge sort or something.

Insertion sort is one of the simplest sorts to write. I use it to optimize this stable sort, as it's super efficient at sorting small sublists. A natural merge sort would also work well in this case, but Timsort would work best. Timsort is also a natural merge sort, but it goes much farther than that.
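
For reference, the small-sublist optimization mentioned here can be as short as this (a generic sketch, not the actual XSort code):

```d
// Stable insertion sort: nearly O(n) on almost-sorted input, and very
// fast on small sublists, which is why merge sorts use it as a base case.
void insertionSort(T)(T[] r)
{
    foreach (i; 1 .. r.length)
    {
        auto v = r[i];
        size_t j = i;
        // strict < leaves equal elements in their original order (stable)
        while (j > 0 && v < r[j - 1])
        {
            r[j] = r[j - 1];
            --j;
        }
        r[j] = v;
    }
}
```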
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 08:37:06 UTC, deadalnix wrote:
 I have a radix sort (that need some rework to be phobos 
 quality) and a smoothsort (that could be included in phobos).

Would you mind sharing your smoothsort? I haven't implemented one myself and I'd love to test it out.

Radix sort, on the other hand, is not a comparison sort. You'd have to rewrite it for every possible element and container type.
 I have a sort for ForwardRange, but it is O(n²) and unstable. 
 However, it is in place.

I posted one a few days ago myself - http://forum.dlang.org/thread/cmipnxrarexjgnrdqlvr forum.dlang.org
 I don't think we should allocate behind one's back, so merge 
 sort should be an option, unless called explicitely.

When it comes to stable sorts, merge sort is the best choice. I found tree sort to be quite slow (using RedBlackTree in std.container), and a stable quick sort is still slower than a merge sort. So I guess that'd mean an in-place merge sort. I've implemented one which was 3x slower than quick sort. Allocating even a small amount of space makes a big difference.
 smoothsort is a good solution for that. radix is also guarantee 
 to be O(n). Insertionsort is quite risky, because it can ends 
 up in O(n²) very easily.

Mar 13 2012
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/13/12 1:31 AM, Xinok wrote:
 I've been playing with sorting algorithms a lot in recent months, so I
 want to implement a *working* stable sort for Phobos which is broken at
 the moment. I have a working library and I'm still adding to it. It's
 much more complex than a simple merge sort, being over 300 lines of code
 at the moment.

Working is better than broken.
 - It's a natural merge sort, which is faster on partially sorted lists,
 and adds little overhead for mostly random lists.
 - It uses O(log n log n) additional space for merging.

That's 1024 when n is 4 billion. I think you can safely approximate it with alloca or a fixed-size stack-allocated array.
 - I wrote it to sort random-access ranges *without* slicing, but I think
 the exclusion of slicing makes it slower. I'm writing a separate
 implementation which uses slicing and I'll keep it if it's much faster.

Having random access implies having slicing.
 - To avoid multiple allocations, the user can allocate their own
 temporary memory and pass it to the sort function.

If you need different allocation strategies, I suggest you make it a policy (like stability is).
 - I decided against using pointers. While I could use them to write
 highly optimized code for arrays, pointers can't be used in safe code
 and don't work very well in CTFE at the moment.

Perhaps it's best to have two distinct implementations guarded by if (__ctfe). The runtime implementation can be trusted.
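
The `if (__ctfe)` pattern Andrei suggests might look like this sketch, with an index-based CTFE branch and a pointer-based runtime branch (the toy algorithm and names are illustrative, not the proposed sort):

```d
void sortCtfeAware(T)(T[] r) @trusted
{
    if (__ctfe)
    {
        // CTFE branch: plain index-based code the interpreter can handle
        foreach (i; 1 .. r.length)
        {
            auto v = r[i];
            size_t j = i;
            while (j > 0 && v < r[j - 1]) { r[j] = r[j - 1]; --j; }
            r[j] = v;
        }
    }
    else
    {
        // Runtime branch: the same algorithm via raw pointers, standing
        // in for the "highly optimized" array code
        auto p = r.ptr;
        foreach (i; 1 .. r.length)
        {
            auto v = p[i];
            auto q = p + i;
            while (q > p && v < *(q - 1)) { *q = *(q - 1); --q; }
            *q = v;
        }
    }
}

// The same entry point works in both worlds:
static assert({ int[] a = [3, 1, 2]; sortCtfeAware(a); return a == [1, 2, 3]; }());
```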
 Is it worth keeping the implementation *without* slicing? Many functions
 in Phobos do require slicing, including the unstable sort, and I think
 most random-access ranges do or could support slicing.

No need.
 What would you prefer in terms of memory usage vs performance? O(n/2)
 space is optimal for performance, O(1) (in-place) requires zero
 allocations but is slower, and O(log n log n) provides a good balance.

The latter rounded up to a constant sounds best.
 Should I implement concurrency? Merge sort is very easy to parallelize,
 and being in-place or natural doesn't change this fact.

Let's save that for std.parallel_algorithm.
 Should we take a different path and go for a basic merge sort or even
 Timsort? I've considered writing a Timsort though that seems like
 daunting task to me, so here's an implementation written in C -
 https://github.com/swenson/sort

I don't know how your function's performance profile compares with timsort's.

Andrei
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:
 On 13/03/2012 10:19, Xinok wrote:
 Would you mind sharing your smoothsort? I haven't implemented 
 one myself
 and I'd love to test it out.

It is on github : https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d

Thanks. I found a couple cases where it performs better, but overall, the overhead of the algorithm seems to be too much and most other algorithms performed better. While some need to be rewritten, I have a slew of algorithms if you want them for your project.
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 14:31:59 UTC, Andrei Alexandrescu 
wrote:
 On 3/13/12 1:31 AM, Xinok wrote:
 - It's a natural merge sort, which is faster on partially 
 sorted lists,
 and adds little overhead for mostly random lists.
 - It uses O(log n log n) additional space for merging.

That's 1024 when n is 4 billion. I think you can safely approximate it with alloca or a fixed-size stack-allocated array.

How about stack allocated for small lists, and heap allocated for larger lists? e.g. Limit the stack to 1KiB and use the heap for anything larger.
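
That hybrid could be sketched like so (a hypothetical helper; the real lib may differ):

```d
enum stackBytes = 1024;

// Return a scratch buffer: a slice of the caller's fixed stack array
// when the request fits, a fresh GC allocation otherwise. The returned
// slice must not outlive the caller's stack frame.
T[] scratch(T)(size_t needed, ref T[stackBytes / T.sizeof] stackBuf)
{
    if (needed <= stackBuf.length)
        return stackBuf[0 .. needed]; // zero allocations for small sorts
    return new T[needed];             // heap fallback for big inputs
}

void main()
{
    int[stackBytes / int.sizeof] buf;
    auto small = scratch!int(100, buf);
    assert(small.ptr is buf.ptr);       // came from the stack
    auto large = scratch!int(100_000, buf);
    assert(large.ptr !is buf.ptr);      // fell back to the heap
}
```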
 - I wrote it to sort random-access ranges *without* slicing, 
 but I think
 the exclusion of slicing makes it slower. I'm writing a 
 separate
 implementation which uses slicing and I'll keep it if it's 
 much faster.

Having random access implies having slicing.
 - To avoid multiple allocations, the user can allocate their 
 own
 temporary memory and pass it to the sort function.

If you need different allocation strategies, I suggest you make it a policy (like stability is).
 - I decided against using pointers. While I could use them to 
 write
 highly optimized code for arrays, pointers can't be used in 
 safe code
 and don't work very well in CTFE at the moment.

Perhaps it's best to have two distinct implementations guarded by if (__ctfe). The runtime implementation can be trusted.

If the performance gain is great enough, I'll consider doing that.
 Is it worth keeping the implementation *without* slicing? Many 
 functions
 in Phobos do require slicing, including the unstable sort, and 
 I think
 most random-access ranges do or could support slicing.

No need.

I'll leave it out of Phobos.
 What would you prefer in terms of memory usage vs performance? 
 O(n/2)
 space is optimal for performance, O(1) (in-place) requires zero
 allocations but is slower, and O(log n log n) provides a good 
 balance.

The latter rounded up to a constant sounds best.
 Should I implement concurrency? Merge sort is very easy to 
 parallelize,
 and being in-place or natural doesn't change this fact.

Let's save that for std.parallel_algorithm.

I'll leave it out of Phobos for now.
Mar 13 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
How does the built-in sort do?  I ask because the sort routine I wrote works the same way, which is optimized for ranges with a lot of common elements.


On Mar 13, 2012, at 7:33 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 3/13/12 4:02 AM, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos. Being stuck
 with one seems kind of... wrong.

Things like this are better left to 3rd party libs. Phobos already has two, a stable and unstable sort, which fulfill 99% of cases.

 I think we need a good sort for ranges of ranges (e.g. array of string). Right now sort() does pretty badly on arrays of strings with large common prefixes.

 Andrei

Mar 13 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
I forgot to mention that my routine uses the same basic algorithm as the built-in sort.

On Mar 13, 2012, at 8:54 AM, Sean Kelly <sean invisibleduck.org> wrote:

 How does the built-in sort do?  I ask because the sort routine I wrote works the same way, which is optimized for ranges with a lot of common elements.

 On Mar 13, 2012, at 7:33 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 3/13/12 4:02 AM, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:53:30 UTC, Chad J wrote:
 Hey, I'd love to see more sorting algorithms in phobos. Being stuck
 with one seems kind of... wrong.

Things like this are better left to 3rd party libs. Phobos already has two, a stable and unstable sort, which fulfill 99% of cases.

 I think we need a good sort for ranges of ranges (e.g. array of string). Right now sort() does pretty badly on arrays of strings with large common prefixes.

 Andrei


Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 16:04:55 UTC, deadalnix wrote:
 On 13/03/2012 16:08, Xinok wrote:
 On Tuesday, 13 March 2012 at 09:32:49 UTC, deadalnix wrote:
 On 13/03/2012 10:19, Xinok wrote:
 Would you mind sharing your smoothsort? I haven't 
 implemented one myself
 and I'd love to test it out.

It is on github : https://github.com/deadalnix/Dsort/blob/master/sort/smooth.d

Thanks. I found a couple cases where it performs better, but overall, the overhead of the algorithm seems to be too much and most other algorithms performed better.

Smoothsort is intended for semi-sorted data (like transparent polygons in a 3D scene), and is ideal for keeping data sorted. It also has a guaranteed O(n log n) running time. But a quicksort variant (like the one we have in Phobos) is faster in the general case.

It only performs well if there aren't many elements to move around. For example, I took a sorted list with 1 million elements, and appended 64 random elements. Smoothsort was the second slowest, only marginally beating heap sort. My natural merge sort was 27x faster.
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 16:11:05 UTC, Andrei Alexandrescu 
wrote:
 On 3/13/12 10:54 AM, Sean Kelly wrote:
 How does the built-in sort do?  I ask because the sort routine 
 I
 wrote works the same way, which is optimized for ranges with a 
 lot of
 common elements.

It's not about common (equal) elements, it's about elements for which comparisons do a lot of work because they have common prefixes. Consider:

auto arr = [ "aaa", "aab", "aac", "aad" ];
sort!((a, b) => a > b)(arr);

There will be a lot of redundant prefix comparisons because the sorting method doesn't have information about the common prefixes. Trie-based sorting is a more efficient method for ranges of ranges, see e.g. http://en.wikipedia.org/wiki/Burstsort.

Andrei

Rather than a sort function, I think we'd benefit more from a Trie in std.container. If implemented correctly, it could be self-sorting like RedBlackTree.
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 06:32:01 UTC, Xinok wrote:
 I've been playing with sorting algorithms a lot in recent 
 months, so I want to implement a *working* stable sort for 
 Phobos which is broken at the moment. I have a working library 
 and I'm still adding to it. It's much more complex than a 
 simple merge sort, being over 300 lines of code at the moment.

I've implemented slicing which has improved benchmarks quite a bit.

Sorting a random array of 1 million uints:
Phobos Unstable Sort - 132ms
Phobos Stable Sort - 2037ms
Proposed Stable Sort - 187ms

Sorting a random array of 1 million strings:
Phobos Unstable Sort - 1228ms
Phobos Stable Sort - 3516ms
Proposed Stable Sort - 1178ms

It still uses O(log n log n) space. I modified the code to allocate up to 1KiB on the stack, and use the heap for anything larger. I simply marked the entry sort function as trusted. The non-slicing code is still in the lib but disabled. I've yet to add contracts, documentation, and a unittest. I won't be adding optimized code for arrays utilizing pointers as I expect the performance gain to be as little as 10%.

You can download the preliminary lib here:
http://www.mediafire.com/?ux49x30dj483dqg
Mar 13 2012
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Tuesday, 13 March 2012 at 14:31:59 UTC, Andrei Alexandrescu 
wrote:
 On 3/13/12 1:31 AM, Xinok wrote:
 - I wrote it to sort random-access ranges *without* slicing, 
 but I think
 the exclusion of slicing makes it slower. I'm writing a 
 separate
 implementation which uses slicing and I'll keep it if it's 
 much faster.

Having random access implies having slicing.

Currently it cannot be assumed that isRandomAccessRange implies slicing:
http://dlang.org/phobos/std_range.html#isRandomAccessRange

Maybe it should be a requirement? It seems to me that bidirectional ranges can't be infinite, and by extension random-access ranges can't either. But slicing could be supported on an infinite range. So hasSlicing is still useful, but I think it could be a good requirement on RA ranges.
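
Concretely, since neither trait implies the other today, a slicing-based sort has to ask for both in its template constraint (sketch):

```d
import std.range : hasSlicing, isRandomAccessRange;

// A slicing-based algorithm must require both traits explicitly:
void stableSortSketch(R)(R r)
    if (isRandomAccessRange!R && hasSlicing!R)
{
    // ... a slicing merge sort over r[0 .. mid] and r[mid .. $]
    // would go here; the constraint is the point of this sketch ...
}

// Built-in arrays satisfy both, so this instantiates:
static assert(is(typeof(stableSortSketch((int[]).init))));
static assert(isRandomAccessRange!(int[]) && hasSlicing!(int[]));
```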
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Tuesday, 13 March 2012 at 18:32:27 UTC, Xinok wrote:
 On Tuesday, 13 March 2012 at 06:32:01 UTC, Xinok wrote:
 I've been playing with sorting algorithms a lot in recent 
 months, so I want to implement a *working* stable sort for 
 Phobos which is broken at the moment. I have a working library 
 and I'm still adding to it. It's much more complex than a 
 simple merge sort, being over 300 lines of code at the moment.

I've implemented slicing which has improved benchmarks quite a bit. It still uses O(log n log n) space. I modified the code to allocate up to 1KiB on the stack, and use the heap for anything larger. I simply marked the entry sort function as trusted. The non-slicing code is still in the lib but disabled. I've yet to add contracts, documentation, and a unittest. I won't be adding optimized code for arrays utilizing pointers as I expect the performance gain to be as little as 10%.

A second update: http://www.mediafire.com/?gqejl17ob1ywyat

I removed the code without slicing from the lib, though I still retain a copy.
I added 3 unittests: A basic sort test, a stability test, and a CTFE test.
I added a few asserts which have helped me discover bugs in the past.
I only added basic documentation. I've found it difficult to explain to others how an in-place merge sort works, so I didn't bother.

I've run the code through several tests. It works, it's stable, and it consistently gives good performance. So for the all-important question: Does anybody want to implement it in std.algorithm for me? I've looked at the code in std.algorithm, and all I can tell is that sortImpl is a recursive function. I have no idea what it's doing or what I need to change.
Mar 13 2012
prev sibling next sibling parent "Xinok" <xinok live.com> writes:
On Wednesday, 14 March 2012 at 03:55:31 UTC, Xinok wrote:
 I removed the code without slicing from the lib, though I still 
 retain a copy.
 I added 3 unittests: A basic sort test, a stability test, and a 
 CTFE test.
 I added a few asserts which have helped me discover bugs in the 
 past.
 I only added basic documentation. I've found it difficult to 
 explain to others how an in-place merge sort works, so I didn't 
 bother.

Third update: http://www.mediafire.com/?9jx07estd58wh2p

+ Added in-place sorting; Set template argument inPlace to true
+ Fixed CTFE compatibility issues
+ Vastly improved unittest
+ CTFE unittest will no longer stop compilation upon failure; It will print a warning instead
+ Optimization: Recurse into smaller half, use tail call on larger half
- CTFE test fails under DMD; Other compilers untested

I currently don't know why CTFE fails. The test performed at compile-time is the exact same test that passes at runtime. The code executes successfully but the array is not sorted correctly.

By implementing a tail call, in-place sorting should only use O(log n) space on the stack, though at the cost of performance.

Sorting a random array of 1 million uints:
Phobos Unstable Sort - 132ms
Phobos Stable Sort - 2037ms
Proposed Stable Sort - 187ms
In-Place Stable Sort - 355ms

Sorting a random array of 1 million strings:
Phobos Unstable Sort - 1228ms
Phobos Stable Sort - 3516ms
Proposed Stable Sort - 1178ms
In-Place Stable Sort - 1422ms

Number of Comparisons on 1 million elements:
Phobos Unstable Sort - 24813812
Phobos Stable Sort - 25304078
Proposed Stable Sort - 19777088
In-Place Stable Sort - 21212726
Mar 15 2012
prev sibling parent "Xinok" <xinok live.com> writes:
On Thursday, 15 March 2012 at 20:20:59 UTC, Xinok wrote:
 Third update:
 http://www.mediafire.com/?9jx07estd58wh2p

 + Added in-place sorting; Set template argument inPlace to true
 + Fixed CTFE compatibility issues
 + Vastly improved unittest
 + CTFE unittest will no longer stop compilation upon failure; 
 It will print a warning instead
 + Optimization: Recurse into smaller half, use tail call on 
 larger half
 - CTFE test fails under DMD; Other compilers untested

One more update. I created a repository on GitHub where I'll host the module from now on:
https://github.com/Xinok/XSort

+ Added concurrency
+ Optimized swapBlocks; 10% improvement in benchmarks
+ Fixed and enhanced unittest
+ Extended documentation of stableSort function
- Concurrency fails to compile in debug builds so is enabled in release builds only
- CTFE test still fails

I'm still getting acquainted with Git and Phobos, not quite sure what I'm doing. Hopefully I can pull it together soon so Phobos can have a working stable sort.
Mar 24 2012