## digitalmars.D - Stable partition3 implementation

- Xinok (13/13) Jul 09 2015 I wanted to get some feedback on a stable partition3
- Xinok (4/7) Jul 09 2015 I apologize, I didn't realize this link was behind a paywall. The
- Nick B (40/45) Jul 09 2015 from the pdf, above, in case readers, like me, don't know the
- Timon Gehr (3/10) Jul 10 2015 Note how /they/ don't mention "complexity". This is because algorithms
- Timon Gehr (8/15) Jul 10 2015 I think this method is likely to be less practically relevant than the
- Xinok (6/15) Jul 10 2015 log* n grows fast for small inputs, so we can't simply ignore it.
- Timon Gehr (20/36) Jul 10 2015 No, it does not. It has no space to grow. With base e, it is 3 or 4 for
- Xinok (15/37) Jul 22 2015 I understand now. I had never heard of an iterated logarithm
- John Colvin (4/21) Jul 22 2015 I haven't looked at the paper, but # followed by a set often
- Andrei Alexandrescu (4/15) Jul 10 2015 I'd say both would be nice (not to mention the one in the paper) so how
- Xinok (20/27) Jul 10 2015 I'm hesitant to add yet another template parameter; IMHO, it's
- Andrei Alexandrescu (2/5) Jul 10 2015 Then give them different names. -- Andrei

I wanted to get some feedback on a stable partition3 implementation for Phobos before I work on a pull request. I wrote this implementation, which is in-place (zero memory allocations) but runs in O(n lg n) time:

http://dpaste.dzfl.pl/e12f50ad947d

I found this paper, which describes an in-place algorithm with O(n) time complexity, but it's over my head at the moment:

http://link.springer.com/article/10.1007%2FBF01994842

It is trivial to implement an algorithm with O(n) time complexity and O(n) space complexity, but that requires allocating memory. Furthermore, using a buffer requires the range to have assignable elements.

Any thoughts?

Jul 09 2015
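The in-place O(n lg n) strategy described above can be sketched as a divide-and-conquer with block rotations. The following is a Python illustration of that general technique, not the code behind the dpaste link (which is not reproduced here):

```python
def _reverse(a, i, j):
    # Reverse a[i:j] in place.
    j -= 1
    while i < j:
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1

def _rotate(a, lo, mid, hi):
    # In-place block swap: move a[mid:hi] before a[lo:mid] (triple-reversal trick).
    _reverse(a, lo, mid)
    _reverse(a, mid, hi)
    _reverse(a, lo, hi)

def stable_partition3(a, pivot, lo=0, hi=None):
    """Stably 3-way partition a[lo:hi] around pivot, in place.

    Returns (i, j) with a[lo:i] < pivot, a[i:j] == pivot, a[j:hi] > pivot,
    preserving relative order within each group. O(n lg n) time, O(lg n) stack.
    """
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        if hi == lo or a[lo] > pivot:
            return lo, lo
        if a[lo] < pivot:
            return hi, hi
        return lo, hi
    mid = (lo + hi) // 2
    li, lj = stable_partition3(a, pivot, lo, mid)   # L< [lo,li) L= [li,lj) L> [lj,mid)
    ri, rj = stable_partition3(a, pivot, mid, hi)   # R< [mid,ri) R= [ri,rj) R> [rj,hi)
    _rotate(a, lj, mid, ri)        # -> L< L= R< L> R= R>
    c = ri - mid                   # size of the R< block
    _rotate(a, li, lj, lj + c)     # -> L< R< L= L> R= R>
    _rotate(a, lj + c, ri, rj)     # -> L< R< L= R= L> R>
    e = rj - ri                    # size of the R= block
    return li + c, lj + c + e
```

Each level of the recursion does O(n) rotation work and the depth is lg n, which matches the O(n lg n) bound mentioned above; rotations preserve relative order within blocks, so the partition is stable.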

On Thursday, 9 July 2015 at 21:57:39 UTC, Xinok wrote:
> I found this paper which describes an in-place algorithm with O(n) time complexity but it's over my head at the moment.
> http://link.springer.com/article/10.1007%2FBF01994842

I apologize, I didn't realize this link was behind a paywall. The paper is freely available here:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.5554&rep=rep1&type=pdf

Jul 09 2015

On Friday, 10 July 2015 at 00:39:16 UTC, Xinok wrote:
> On Thursday, 9 July 2015 at 21:57:39 UTC, Xinok wrote:
>> I found this paper which describes an in-place algorithm with O(n) time complexity but it's over my head at the moment.
> [snip]
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.5554&rep=rep1&type=pdf

From the pdf above, in case readers, like me, don't know the context of a stable partition implementation:

"Stable minimum space partitioning in linear time"
Jyrki Katajainen and Tomi Pasanen

Abstract. In the stable 0-1 sorting problem the task is to sort an array of n elements with two distinct values such that equal elements retain their relative input order. Recently, Munro, Raman and Salowe [BIT 1990] gave an algorithm which solves this problem in O(n log* n) time and constant extra space. We show that by a modification of their method the stable 0-1 sorting is possible in O(n) time and O(1) extra space. Stable three-way partitioning can be reduced to stable 0-1 sorting. This immediately yields a stable minimum space quicksort, which sorts multisets in asymptotically optimal time with high probability. CR categories: E.5, F.2.2.

The stable 0-1 sorting problem is defined as follows: Given an array of n elements and a function f mapping each element to the set {0,1}, the task is to rearrange the elements such that all elements whose f-value is zero come before elements whose f-value is one. Moreover, the relative order of elements with equal f-values should be maintained. For the sake of simplicity, we hereafter refer to bits instead of the f-values of elements. Stable partitioning is a special case of stable 0-1 sorting, where the f-values are obtained by comparing every element x_i to some pivot element x_j (which will not take part in partitioning).

Jul 09 2015
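The reduction the abstract mentions — stable three-way partitioning via stable 0-1 sorting — can be illustrated with a deliberately naive Python sketch. It uses the trivial O(n)-extra-space stable 0-1 sort; the paper's contribution is achieving the same result in O(1) extra space:

```python
def stable_01_sort(a, f):
    # Trivial stable 0-1 sort: O(n) time but O(n) extra space.
    # Elements with f(x) == 0 keep their order and come before those with f(x) == 1.
    a[:] = [x for x in a if f(x) == 0] + [x for x in a if f(x) == 1]

def stable_partition3_via_01(a, pivot):
    # Pass 1: bit is "x >= pivot", so everything < pivot moves stably to the front.
    stable_01_sort(a, lambda x: 0 if x < pivot else 1)
    i = sum(1 for x in a if x < pivot)
    # Pass 2 on the tail: bit is "x > pivot", separating '== pivot' from '> pivot'.
    tail = a[i:]
    stable_01_sort(tail, lambda x: 0 if x == pivot else 1)
    a[i:] = tail
    j = i + sum(1 for x in a if x == pivot)
    return i, j
```

Two stable 0-1 sorting passes thus give a stable three-way partition, which is why an O(n)-time, O(1)-space 0-1 sort immediately yields the partition3 everyone is after.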

On 07/10/2015 03:48 AM, Nick B wrote:
> ... stable 0-1 sorting is possible in O(n) time and O(1) extra space.

Note how /they/ don't mention "complexity". This is because algorithms don't have "complexities". Problems do. (Sorry, pet peeve of mine.)

Jul 10 2015

On 07/09/2015 11:57 PM, Xinok wrote:
> I found this paper which describes an in-place algorithm with O(n) time complexity but it's over my head at the moment.
> http://link.springer.com/article/10.1007%2FBF01994842
> It is trivial to implement an algorithm with O(n) time complexity and O(n) space complexity, but that requires allocating memory. Furthermore, using a buffer requires the range to have assignable elements. Any thoughts?

I think this method is likely to be less practically relevant than the one they improve upon. (log* n really is constant for all practical purposes; it is the number of times one needs to iteratively take the logarithm until a number below 1 is obtained.) That paper appears to be available here:

https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf

No idea what the constant is, though; I have not read the paper (yet).

Jul 10 2015

On Friday, 10 July 2015 at 21:26:50 UTC, Timon Gehr wrote:
> I think this method is likely to be less practically relevant than the one they improve upon. (log* n really is constant for all practical purposes, it is the number of times one needs to iteratively take the logarithm until a number below 1 is obtained.)

log* n grows fast for small inputs, so we can't simply ignore it. For example, if n = 2^16 = 65536, then n*lg(n) = 16*2^16 = 2^20 = 1048576.

> That paper appears to be available here:
> https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf
> No idea what the constant is though, I have not read the paper (yet).

I don't know if there is anything relevant, but that paper focuses on a different problem involving sorting.

Jul 10 2015

On 07/11/2015 02:32 AM, Xinok wrote:
> log* n grows fast for small inputs,

No, it does not. It has no space to grow. With base e, it is 3 or 4 for all input sizes that matter:

http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^8
http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^32
http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^64
http://www.wolframalpha.com/input/?i=IteratedLog%282^2^2^2^2%29

> so we can't simply ignore it.

There is a priori no practical difference between an O(n) algorithm and an O(n*log*(n)) algorithm. It all depends on the constants, and it is hence not clear-cut that the O(n) algorithm will run faster. I'm just saying that this should be taken into consideration.

> For example, if n = 2^16 = 65536, then n*lg(n) = 16*2^16 = 2^20 = 1048576.

The algorithm runs in O(n*log*(n)). It's not n log(n).

> I don't know if there is anything relevant but that paper focuses on a different problem involving sorting.

No, it focuses on a few closely related problems, and the O(n*log*(n)) algorithm solves a problem that is a straightforward generalization of the problem you are looking at. Stable three-way partition is stable sorting where there are only three "distinct" values (smaller, equal, greater). The paper you provided builds on top of this algorithm. It is the main reference, and part (ii) of the "Algorithm B" part of the O(n)/O(1) algorithm does not occur in the paper you provided, but only in that other paper. So yes, it is relevant. :-)

Jul 10 2015
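The point about log* n being 3 or 4 for all practical inputs is easy to verify numerically; here is a small Python sketch of the iterated logarithm, counting applications of the natural log until the value drops below 1:

```python
import math

def log_star(n):
    # Iterated logarithm (base e): how many times must log be applied
    # before the value drops below 1?
    count = 0
    x = float(n)
    while x >= 1.0:
        x = math.log(x)
        count += 1
    return count
```

For n = 2^16 this gives 3, and for n = 2^64 it gives 4, consistent with the WolframAlpha plots above.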

On Saturday, 11 July 2015 at 02:56:55 UTC, Timon Gehr wrote:
> The algorithm runs in O(n*log*(n)). It's not n log(n).

I understand now. I had never heard of an iterated logarithm before. I took the asterisk to mean some constant, like a wild card if you will. Sorry for the confusion.

I wonder if the true O(n) algorithm is still worth pursuing. Although log*(n) may only be a factor of 2 or 3, the O(n) algorithm may have a small constant factor which makes it 2-3x faster.

> No, it focuses on a few closely related problems, and the O(n*log*(n)) algorithm solves a problem that is a straightforward generalization of the problem you are looking at. [...] So yes, it is relevant. :-)

I get that much now, but this paper is still way above my head. There's some notation in the sample code that I don't understand. In this statement:

e <- #{ j : L[j] = x, 1 <= j <= n }

I'm not sure what the pound sign # signifies or what exactly is being assigned to e. I'm assuming it performs some operation on the set, like "max", but I'm not sure what.

Jul 22 2015

On Wednesday, 22 July 2015 at 16:59:29 UTC, Xinok wrote:
> [...]
> e <- #{ j : L[j] = x, 1 <= j <= n }
> I'm not sure what the pound # signifies or what exactly is being assigned to e.

I haven't looked at the paper, but # followed by a set often means cardinality, i.e. (assuming the set is finite) the number of elements in the set.

Jul 22 2015
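In Python terms, the cardinality reading works out as follows (a sketch; L and x here are made-up example values, not taken from the paper):

```python
# e <- #{ j : L[j] = x, 1 <= j <= n } with # read as set cardinality:
# e is the number of indices j at which L[j] equals x.
L = [2, 7, 2, 9, 2]
x = 2
e = len({j for j in range(len(L)) if L[j] == x})
# equivalently: e = sum(1 for v in L if v == x)
```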

On 7/9/15 5:57 PM, Xinok wrote:
> I wanted to get some feedback on a stable partition3 implementation for Phobos before I work on a pull request. [...] Any thoughts?

I'd say both would be nice (not to mention the one in the paper), so how about both are present, selectable with a policy a la Yes.tightMemory or something. -- Andrei

Jul 10 2015

On Saturday, 11 July 2015 at 00:00:47 UTC, Andrei Alexandrescu wrote:
> I'd say both would be nice (not to mention the one in the paper) so how about both are present selectable with a policy a la Yes.tightMemory or something. -- Andrei

I'm hesitant to add yet another template parameter; IMHO, it's bad enough that we have to manually write the predicate just to reach the swap strategy.

There's a fourth option I didn't think to mention. It's easy to utilize a small buffer which would be used to partition small sublists instead of rotations. partition3 would not allocate its own memory, so it would be in-place by default, but the user could provide their own buffer and pass it as an extra function argument. For example:

int[] arr = iota(0, 100000).array;
int[] buf = new int[100];
partition3!(...)(arr, 1000, buf);

Interestingly, for some constant k, if you implement this algorithm to use O(n/k) space, then it runs in O(n lg n) time because it performs at most O(lg k) rotations.

Regarding the algorithm in the paper, I don't have it completely figured out yet because it refers to algorithms in other papers which I can't find.

Jul 10 2015
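The buffer-taking API proposed above might look like the following in spirit. This is a Python sketch of the simplest full-buffer variant, where the scratch buffer is at least as large as the input; partition3_buffered and its behavior are illustrative assumptions, not an actual Phobos signature, and the hybrid described above would instead use a small buffer for small sublists and fall back to rotations for larger ones:

```python
def partition3_buffered(a, pivot, buf):
    # Stable 3-way partition using a caller-provided scratch buffer:
    # O(n) time, no allocation beyond the caller's buffer.
    if len(buf) < len(a):
        raise ValueError("this sketch needs len(buf) >= len(a)")
    k = 0           # next write position in a for '<' elements
    ne = 0          # '==' elements staged at the front of buf
    g = len(buf)    # '>' elements staged (reversed) at the back of buf
    for x in list(a):
        if x < pivot:
            a[k] = x
            k += 1
        elif x == pivot:
            buf[ne] = x
            ne += 1
        else:
            g -= 1
            buf[g] = x
    i, j = k, k + ne
    a[i:j] = buf[:ne]          # copy back the '==' run
    a[j:] = buf[g:][::-1]      # '>' run, re-reversed to restore input order
    return i, j
```

Staging the '==' elements from the front of the buffer and the '>' elements from the back lets a single scratch array serve both groups; re-reversing the '>' run on copy-back keeps the partition stable.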

On 7/10/15 8:55 PM, Xinok wrote:
> I'm hesitant to add yet another template parameter; IMHO, it's bad enough that we have to manually write the predicate just to reach the swap strategy.

Then give them different names. -- Andrei

Jul 10 2015