## digitalmars.D - Should Phobos use Timsort instead? (with a small tweak)

- Xinok <xinok live.com> Nov 07 2011
- bearophile <bearophileHUGS lycos.com> Nov 07 2011
- dsimcha <dsimcha yahoo.com> Nov 07 2011
- Xinok <xinok live.com> Nov 07 2011
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Nov 07 2011
- Xinok <xinok live.com> Nov 07 2011
- Stewart Gordon <smjg_1998 yahoo.com> Dec 29 2012
- "ixid" <nuaccount gmail.com> Dec 29 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> Dec 29 2012
- "ixid" <nuaccount gmail.com> Dec 29 2012
- "Xinok" <xinok live.com> Dec 29 2012

Recently, I've been working on my sorting algorithm, doing what I can before I implemented it into std.algorithm. Recently, I've found myself referring to Timsort for ways to optimize my algorithm, but that made me think, why not use Timsort instead? It was originally designed for Python, but it was apparently good enough for Java to adopt it as well. I think Phobos would benefit most from a good implementation of Timsort, perhaps enough to even ditch the unstable sort which I've found has some bad test cases (try sorting swapRanges(arr[0..$/2], arr[$/2..$])). My algorithm is very similar to Timsort, but Timsort is more highly tuned, so it would probably beat my algorithm in most cases. In a few benchmarks I've seen, Timsort even beats quicksort. The only downside is that Timsort may require up to O(n/2) additional space, and if it fails to allocate enough memory, it simply fails to sort the list. That's the benefit of my algorithm, it allocates O(n/1024) additional space by default and can reduce it further if needed. However, the "range swap" that my algorithm uses could easily be added to Timsort as well, so we could have the best of both worlds: The speed of Timsort with the low memory requirement of my algorithm. This is my proposal: Replace the stable (and unstable?) sort in Phobos with Timsort, including the additional range swap step. An explanation of Timsort can be found here: http://svn.python.org/projects/python/trunk/Objects/listsort.txt My algorithm can be found here (see the blog for more details): https://sourceforge.net/projects/xinoksort/

Nov 07 2011

Xinok:This is my proposal: Replace the stable (and unstable?) sort in Phobos with Timsort,

Replacing the stable Phobos sort with TimSort is a nice idea. Bye, bearophile

Nov 07 2011

Phobos definitely needs a decent stable sort. I had been meaning to contribute a very highly tuned merge sort that I use for some statistical calculations after allocators get into Phobos. Timsort would make a good additional option. However, I **strongly** recommend against **replacing** a sort that does not require O(n) extra space with one that does. On 11/7/2011 3:32 PM, Xinok wrote:Recently, I've been working on my sorting algorithm, doing what I can before I implemented it into std.algorithm. Recently, I've found myself referring to Timsort for ways to optimize my algorithm, but that made me think, why not use Timsort instead? It was originally designed for Python, but it was apparently good enough for Java to adopt it as well. I think Phobos would benefit most from a good implementation of Timsort, perhaps enough to even ditch the unstable sort which I've found has some bad test cases (try sorting swapRanges(arr[0..$/2], arr[$/2..$])). My algorithm is very similar to Timsort, but Timsort is more highly tuned, so it would probably beat my algorithm in most cases. In a few benchmarks I've seen, Timsort even beats quicksort. The only downside is that Timsort may require up to O(n/2) additional space, and if it fails to allocate enough memory, it simply fails to sort the list. That's the benefit of my algorithm, it allocates O(n/1024) additional space by default and can reduce it further if needed. However, the "range swap" that my algorithm uses could easily be added to Timsort as well, so we could have the best of both worlds: The speed of Timsort with the low memory requirement of my algorithm. This is my proposal: Replace the stable (and unstable?) sort in Phobos with Timsort, including the additional range swap step. An explanation of Timsort can be found here: http://svn.python.org/projects/python/trunk/Objects/listsort.txt My algorithm can be found here (see the blog for more details): https://sourceforge.net/projects/xinoksort/

Nov 07 2011

On 11/7/2011 7:47 PM, dsimcha wrote:Phobos definitely needs a decent stable sort. I had been meaning to contribute a very highly tuned merge sort that I use for some statistical calculations after allocators get into Phobos. Timsort would make a good additional option. However, I **strongly** recommend against **replacing** a sort that does not require O(n) extra space with one that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?

Nov 07 2011

On 11/7/11 9:28 PM, Xinok wrote:On 11/7/2011 7:47 PM, dsimcha wrote:Phobos definitely needs a decent stable sort. I had been meaning to contribute a very highly tuned merge sort that I use for some statistical calculations after allocators get into Phobos. Timsort would make a good additional option. However, I **strongly** recommend against **replacing** a sort that does not require O(n) extra space with one that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?

Would be great to provide a better implementation of both stable and unstable sort. The part about requiring extra memory may be problematic, but we may use e.g. an additional policy parameter to decide on that. One thing I find confusing about the "range swap" operation that you rely on and mention in http://cl.ly/3L1o111u1M232q061o2N (and in the description of xinok sort which I skimmed without being able to understand) is that the example does not reflect the generality alluded to in the text. For example, the circled ranges at the top satisfy three conditions: 1. They are adjacent 2. Each is sorted 3. Each element in the left range is greater than any element in the right range 4. They have the same size (four elements each) It is unclear how many of these conditions must be _actually_ satisfied for range swap to work. Are all needed? The use of "half" and "merged" is also quite casual. I wish a clearer description were available. At any rate, if a provably better algorithm than what's in Phobos is available, we should include it in Phobos. I'm glad people find the necessity of a faster stable sort algorithm - it reflects maturity and sophistication in the community. Andrei

Nov 07 2011

On 11/7/2011 10:44 PM, Andrei Alexandrescu wrote:On 11/7/11 9:28 PM, Xinok wrote:On 11/7/2011 7:47 PM, dsimcha wrote:Phobos definitely needs a decent stable sort. I had been meaning to contribute a very highly tuned merge sort that I use for some statistical calculations after allocators get into Phobos. Timsort would make a good additional option. However, I **strongly** recommend against **replacing** a sort that does not require O(n) extra space with one that does.

Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required. Timsort would still use up to O(n/2) space, but if it failed to allocate enough memory, it could resort to the range swap instead. I'll leave it up to the community to decide what's best. But if the stable sort proves to be faster, has a worst case of O(n log n), and can succeed despite failing to allocate additional space, what purpose is there in keeping the unstable sort?

Would be great to provide a better implementation of both stable and unstable sort. The part about requiring extra memory may be problematic, but we may use e.g. an additional policy parameter to decide on that. One thing I find confusing about the "range swap" operation that you rely on and mention in http://cl.ly/3L1o111u1M232q061o2N (and in the description of xinok sort which I skimmed without being able to understand) is that the example does not reflect the generality alluded to in the text. For example, the circled ranges at the top satisfy three conditions: 1. They are adjacent 2. Each is sorted 3. Each element in the left range is greater than any element in the right range 4. They have the same size (four elements each) It is unclear how many of these conditions must be _actually_ satisfied for range swap to work. Are all needed? The use of "half" and "merged" is also quite casual. I wish a clearer description were available. At any rate, if a provably better algorithm than what's in Phobos is available, we should include it in Phobos. I'm glad people find the necessity of a faster stable sort algorithm - it reflects maturity and sophistication in the community. Andrei

1. Yes. As mentioned in the graphic, the key is to move the smallest values to the left half, and the largest values to the right half. Since in a sorted list, the smallest values are at the beginning, and the largest values are at the end, the two ranges to swap will always be adjacent. 2. Yes. As you know, merge sort works by recursively merging two sorted lists into one sorted list. Merge sort generally requires O(n) space, or O(n/2) space if you only make a copy of the smaller half. The graphic intends to show how range swap reduces the space requirement. Instead of doing one large merge, you do two smaller merges. 3. Yes, see #1. 4. Yes. The "range swap" literally swaps two ranges of elements, so they must be equal in length. If you're still confused, think of it like this: In quick sort, you choose a pivot value, then you move all smaller values to the left of the pivot, and all larger values to the right of the pivot. Once you do this, there's no need to move any value in the left half to the right half, and vice versa. Similarly, range swap moves all the smallest values to the left half, and all the largest values to the right half, so there's no need to move values between the two halves.

Nov 07 2011

On 08/11/2011 03:28, Xinok wrote: <snip>Take a look at the graphic here: http://cl.ly/3L1o111u1M232q061o2N

404That's the extra step that I propose adding to Timsort, if it were implemented in Phobos. By doing a single range swap, you can reduce the space requirement from O(n/2) to O(n/4). My algorithm utilizes it so that only O(n/1024) additional space is required.

Uh, O(n/2), O(n/4) and O(n/1024) are all the same thing. Stewart.

Dec 29 2012

On Monday, 7 November 2011 at 20:33:19 UTC, Xinok wrote:Recently, I've been working on my sorting algorithm, doing what I can before I implemented it into std.algorithm. Recently, I've found myself referring to Timsort for ways to optimize my algorithm, but that made me think, why not use Timsort instead? It was originally designed for Python, but it was apparently good enough for Java to adopt it as well. I think Phobos would benefit most from a good implementation of Timsort, perhaps enough to even ditch the unstable sort which I've found has some bad test cases (try sorting swapRanges(arr[0..$/2], arr[$/2..$])). My algorithm is very similar to Timsort, but Timsort is more highly tuned, so it would probably beat my algorithm in most cases. In a few benchmarks I've seen, Timsort even beats quicksort. The only downside is that Timsort may require up to O(n/2) additional space, and if it fails to allocate enough memory, it simply fails to sort the list. That's the benefit of my algorithm, it allocates O(n/1024) additional space by default and can reduce it further if needed. However, the "range swap" that my algorithm uses could easily be added to Timsort as well, so we could have the best of both worlds: The speed of Timsort with the low memory requirement of my algorithm. This is my proposal: Replace the stable (and unstable?) sort in Phobos with Timsort, including the additional range swap step. An explanation of Timsort can be found here: http://svn.python.org/projects/python/trunk/Objects/listsort.txt My algorithm can be found here (see the blog for more details): https://sourceforge.net/projects/xinoksort/

This sounds good, you should also contact the forum user Philippe Sigaud to see if he's made any progress with his templated sorting network idea for small number of items and combine the two to provide very effective sorting.

Dec 29 2012

12/29/2012 10:49 PM, ixid пишет:On Monday, 7 November 2011 at 20:33:19 UTC, Xinok wrote:Recently, I've been working on my sorting algorithm, doing what I can before I implemented it into std.algorithm. Recently, I've found myself referring to Timsort for ways to optimize my algorithm, but that made me think, why not use Timsort instead? It was originally designed for Python, but it was apparently good enough for Java to adopt it as well.

[snip]An explanation of Timsort can be found here: http://svn.python.org/projects/python/trunk/Objects/listsort.txt My algorithm can be found here (see the blog for more details): https://sourceforge.net/projects/xinoksort/

This sounds good, you should also contact the forum user Philippe Sigaud to see if he's made any progress with his templated sorting network idea for small number of items and combine the two to provide very effective sorting.

Let me point out that Phobos has got the Timsort for stable sort in 2.061. It's precisely the work of Xinok that was integrated after going through many rounds of review. It would great to analyze the extra trick that reduces the amount of memory required. If it's proven to be a good speed-size trade-off then it could be integrated. What's would be truly awesome at the moment is highly efficient version(s) of sort(s) customized for small ranges. And IRC that's what Philippe's sorting networks were good at. -- Dmitry Olshansky

Dec 29 2012

What's would be truly awesome at the moment is highly efficient version(s) of sort(s) customized for small ranges. And IRC that's what Philippe's sorting networks were good at.

Wasn't that what I was saying? =) It seems like an ideal application of D's strengths to be easily able to write something like sort!"timsort" or sort!"mergesort" (or whatever formatting) in addition to intelligent analysis of the number of items and select a sorting network if that would be favourable.

Dec 29 2012

On Saturday, 29 December 2012 at 19:07:52 UTC, Dmitry Olshansky wrote:Let me point out that Phobos has got the Timsort for stable sort in 2.061. It's precisely the work of Xinok that was integrated after going through many rounds of review. It would great to analyze the extra trick that reduces the amount of memory required. If it's proven to be a good speed-size trade-off then it could be integrated.

I suppose it's worth a try. The trick in question takes two large runs and reduces them into several smaller runs for merging. The problem I foresee is that this may negatively affect the galloping mode of Timsort, which is most effective when merging two large runs. But I'll add this as a feature to my Timsort module anyways and see what effect it has.

Dec 29 2012