www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Should Phobos use Timsort instead? (with a small tweak)

reply Xinok <xinok live.com> writes:
Recently, I've been working on my sorting algorithm, doing what I can 
before I implemented it into std.algorithm. Recently, I've found myself 
referring to Timsort for ways to optimize my algorithm, but that made me 
think, why not use Timsort instead? It was originally designed for 
Python, but it was apparently good enough for Java to adopt it as well.

I think Phobos would benefit most from a good implementation of Timsort, 
perhaps enough to even ditch the unstable sort which I've found has some 
bad test cases (try sorting swapRanges(arr[0..$/2], arr[$/2..$])). My 
algorithm is very similar to Timsort, but Timsort is more highly tuned, 
so it would probably beat my algorithm in most cases. In a few 
benchmarks I've seen, Timsort even beats quicksort.

The only downside is that Timsort may require up to O(n/2) additional 
space, and if it fails to allocate enough memory, it simply fails to 
sort the list. That's the benefit of my algorithm, it allocates 
O(n/1024) additional space by default and can reduce it further if 
needed. However, the "range swap" that my algorithm uses could easily be 
added to Timsort as well, so we could have the best of both worlds: The 
speed of Timsort with the low memory requirement of my algorithm.

This is my proposal: Replace the stable (and unstable?) sort in Phobos 
with Timsort, including the additional range swap step.

An explanation of Timsort can be found here:
http://svn.python.org/projects/python/trunk/Objects/listsort.txt

My algorithm can be found here (see the blog for more details):
https://sourceforge.net/projects/xinoksort/
Nov 07 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Xinok:

 This is my proposal: Replace the stable (and unstable?) sort in Phobos 
 with Timsort,

Replacing the stable Phobos sort with TimSort is a nice idea. Bye, bearophile
Nov 07 2011
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
Phobos definitely needs a decent stable sort.  I had been meaning to 
contribute a very highly tuned merge sort that I use for some 
statistical calculations after allocators get into Phobos.  Timsort 
would make a good additional option.  However, I **strongly** recommend 
against **replacing** a sort that does not require O(n) extra space with 
one that does.

On 11/7/2011 3:32 PM, Xinok wrote:
 Recently, I've been working on my sorting algorithm, doing what I can
 before I implemented it into std.algorithm. Recently, I've found myself
 referring to Timsort for ways to optimize my algorithm, but that made me
 think, why not use Timsort instead? It was originally designed for
 Python, but it was apparently good enough for Java to adopt it as well.

 I think Phobos would benefit most from a good implementation of Timsort,
 perhaps enough to even ditch the unstable sort which I've found has some
 bad test cases (try sorting swapRanges(arr[0..$/2], arr[$/2..$])). My
 algorithm is very similar to Timsort, but Timsort is more highly tuned,
 so it would probably beat my algorithm in most cases. In a few
 benchmarks I've seen, Timsort even beats quicksort.

 The only downside is that Timsort may require up to O(n/2) additional
 space, and if it fails to allocate enough memory, it simply fails to
 sort the list. That's the benefit of my algorithm, it allocates
 O(n/1024) additional space by default and can reduce it further if
 needed. However, the "range swap" that my algorithm uses could easily be
 added to Timsort as well, so we could have the best of both worlds: The
 speed of Timsort with the low memory requirement of my algorithm.

 This is my proposal: Replace the stable (and unstable?) sort in Phobos
 with Timsort, including the additional range swap step.

 An explanation of Timsort can be found here:
 http://svn.python.org/projects/python/trunk/Objects/listsort.txt

 My algorithm can be found here (see the blog for more details):
 https://sourceforge.net/projects/xinoksort/

Nov 07 2011
parent reply Xinok <xinok live.com> writes:
On 11/7/2011 7:47 PM, dsimcha wrote:
 Phobos definitely needs a decent stable sort. I had been meaning to
 contribute a very highly tuned merge sort that I use for some
 statistical calculations after allocators get into Phobos. Timsort would
 make a good additional option. However, I **strongly** recommend against
 **replacing** a sort that does not require O(n) extra space with one
 that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?
Nov 07 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/7/11 9:28 PM, Xinok wrote:
 On 11/7/2011 7:47 PM, dsimcha wrote:
 Phobos definitely needs a decent stable sort. I had been meaning to
 contribute a very highly tuned merge sort that I use for some
 statistical calculations after allocators get into Phobos. Timsort would
 make a good additional option. However, I **strongly** recommend against
 **replacing** a sort that does not require O(n) extra space with one
 that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?

Would be great to provide a better implementation of both stable and unstable sort. The part about requiring extra memory may be problematic, but we may use e.g. an additional policy parameter to decide on that. One thing I find confusing about the "range swap" operation that you rely on and mention in http://cl.ly/3L1o111u1M232q061o2N (and in the description of xinok sort which I skimmed without being able to understand) is that the example does not reflect the generality alluded to in the text. For example, the circled ranges at the top satisfy three conditions: 1. They are adjacent 2. Each is sorted 3. Each element in the left range is greater than any element in the right range 4. They have the same size (four elements each) It is unclear how many of these conditions must be _actually_ satisfied for range swap to work. Are all needed? The use of "half" and "merged" is also quite casual. I wish a clearer description were available. At any rate, if a provably better algorithm than what's in Phobos is available, we should include it in Phobos. I'm glad people find the necessity of a faster stable sort algorithm - it reflects maturity and sophistication in the community. Andrei
Nov 07 2011
parent Xinok <xinok live.com> writes:
On 11/7/2011 10:44 PM, Andrei Alexandrescu wrote:
 On 11/7/11 9:28 PM, Xinok wrote:
 On 11/7/2011 7:47 PM, dsimcha wrote:
 Phobos definitely needs a decent stable sort. I had been meaning to
 contribute a very highly tuned merge sort that I use for some
 statistical calculations after allocators get into Phobos. Timsort would
 make a good additional option. However, I **strongly** recommend against
 **replacing** a sort that does not require O(n) extra space with one
 that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?

Would be great to provide a better implementation of both stable and unstable sort. The part about requiring extra memory may be problematic, but we may use e.g. an additional policy parameter to decide on that. One thing I find confusing about the "range swap" operation that you rely on and mention in http://cl.ly/3L1o111u1M232q061o2N (and in the description of xinok sort which I skimmed without being able to understand) is that the example does not reflect the generality alluded to in the text. For example, the circled ranges at the top satisfy three conditions: 1. They are adjacent 2. Each is sorted 3. Each element in the left range is greater than any element in the right range 4. They have the same size (four elements each) It is unclear how many of these conditions must be _actually_ satisfied for range swap to work. Are all needed? The use of "half" and "merged" is also quite casual. I wish a clearer description were available. At any rate, if a provably better algorithm than what's in Phobos is available, we should include it in Phobos. I'm glad people find the necessity of a faster stable sort algorithm - it reflects maturity and sophistication in the community. Andrei

1. Yes. As mentioned in the graphic, the key is to move the smallest values to the left half, and the largest values to the right half. Since in a sorted list, the smallest values are at the beginning, and the largest values are at the end, the two ranges to swap will always be adjacent. 2. Yes. As you know, merge sort works by recursively merging two sorted lists into one sorted list. Merge sort generally requires O(n) space, or O(n/2) space if you only make a copy of the smaller half. The graphic intends to show how range swap reduces the space requirement. Instead of doing one large merge, you do two smaller merges. 3. Yes, see #1. 4. Yes. The "range swap" literally swaps two ranges of elements, so they must be equal in length. If you're still confused, think of it like this: In quick sort, you choose a pivot value, then you move all smaller values to the left of the pivot, and all larger values to the right of the pivot. Once you do this, there's no need to move any value in the left half to the right half, and vice versa. Similarly, range swap moves all the smallest values to the left half, and all the largest values to the right half, so there's no need to move values between the two halves.
Nov 07 2011
prev sibling parent Stewart Gordon <smjg_1998 yahoo.com> writes:
On 08/11/2011 03:28, Xinok wrote:
<snip>
 Take a look at the graphic here:
 http://cl.ly/3L1o111u1M232q061o2N

404
 That's the extra step that I propose adding to Timsort, if it were
 implemented in Phobos. By doing a single range swap, you can reduce the
 space requirement from O(n/2) to O(n/4). My algorithm utilizes it so
 that only O(n/1024) additional space is required.

Uh, O(n/2), O(n/4) and O(n/1024) are all the same thing. Stewart.
Dec 29 2012
prev sibling next sibling parent reply "ixid" <nuaccount gmail.com> writes:
On Monday, 7 November 2011 at 20:33:19 UTC, Xinok wrote:
 Recently, I've been working on my sorting algorithm, doing what 
 I can before I implemented it into std.algorithm. Recently, 
 I've found myself referring to Timsort for ways to optimize my 
 algorithm, but that made me think, why not use Timsort instead? 
 It was originally designed for Python, but it was apparently 
 good enough for Java to adopt it as well.

 I think Phobos would benefit most from a good implementation of 
 Timsort, perhaps enough to even ditch the unstable sort which 
 I've found has some bad test cases (try sorting 
 swapRanges(arr[0..$/2], arr[$/2..$])). My algorithm is very 
 similar to Timsort, but Timsort is more highly tuned, so it 
 would probably beat my algorithm in most cases. In a few 
 benchmarks I've seen, Timsort even beats quicksort.

 The only downside is that Timsort may require up to O(n/2) 
 additional space, and if it fails to allocate enough memory, it 
 simply fails to sort the list. That's the benefit of my 
 algorithm, it allocates O(n/1024) additional space by default 
 and can reduce it further if needed. However, the "range swap" 
 that my algorithm uses could easily be added to Timsort as 
 well, so we could have the best of both worlds: The speed of 
 Timsort with the low memory requirement of my algorithm.

 This is my proposal: Replace the stable (and unstable?) sort in 
 Phobos with Timsort, including the additional range swap step.

 An explanation of Timsort can be found here:
 http://svn.python.org/projects/python/trunk/Objects/listsort.txt

 My algorithm can be found here (see the blog for more details):
 https://sourceforge.net/projects/xinoksort/

This sounds good, you should also contact the forum user Philippe Sigaud to see if he's made any progress with his templated sorting network idea for small number of items and combine the two to provide very effective sorting.
Dec 29 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12/29/2012 10:49 PM, ixid пишет:
 On Monday, 7 November 2011 at 20:33:19 UTC, Xinok wrote:
 Recently, I've been working on my sorting algorithm, doing what I can
 before I implemented it into std.algorithm. Recently, I've found
 myself referring to Timsort for ways to optimize my algorithm, but
 that made me think, why not use Timsort instead? It was originally
 designed for Python, but it was apparently good enough for Java to
 adopt it as well.


[snip]
 An explanation of Timsort can be found here:
 http://svn.python.org/projects/python/trunk/Objects/listsort.txt

 My algorithm can be found here (see the blog for more details):
 https://sourceforge.net/projects/xinoksort/

This sounds good, you should also contact the forum user Philippe Sigaud to see if he's made any progress with his templated sorting network idea for small number of items and combine the two to provide very effective sorting.

Let me point out that Phobos has got the Timsort for stable sort in 2.061. It's precisely the work of Xinok that was integrated after going through many rounds of review. It would great to analyze the extra trick that reduces the amount of memory required. If it's proven to be a good speed-size trade-off then it could be integrated. What's would be truly awesome at the moment is highly efficient version(s) of sort(s) customized for small ranges. And IRC that's what Philippe's sorting networks were good at. -- Dmitry Olshansky
Dec 29 2012
prev sibling next sibling parent "ixid" <nuaccount gmail.com> writes:
 What's would be truly awesome at the moment is highly efficient 
 version(s) of sort(s) customized for small ranges. And IRC 
 that's what Philippe's sorting networks were good at.

Wasn't that what I was saying? =) It seems like an ideal application of D's strengths to be easily able to write something like sort!"timsort" or sort!"mergesort" (or whatever formatting) in addition to intelligent analysis of the number of items and select a sorting network if that would be favourable.
Dec 29 2012
prev sibling parent "Xinok" <xinok live.com> writes:
On Saturday, 29 December 2012 at 19:07:52 UTC, Dmitry Olshansky 
wrote:
 Let me point out that Phobos has got the Timsort for stable 
 sort in 2.061. It's precisely the work of Xinok that was 
 integrated after going through many rounds of review.

 It would great to analyze the extra trick that reduces the 
 amount of memory required. If it's proven to be a good 
 speed-size trade-off then it could be integrated.

I suppose it's worth a try. The trick in question takes two large runs and reduces them into several smaller runs for merging. The problem I foresee is that this may negatively affect the galloping mode of Timsort, which is most effective when merging two large runs. But I'll add this as a feature to my Timsort module anyways and see what effect it has.
Dec 29 2012