
digitalmars.D - Stable partition3 implementation

reply "Xinok" <xinok live.com> writes:
I wanted to get some feedback on a stable partition3 
implementation for Phobos before I work on a pull request. I 
wrote this implementation which is in-place (zero memory 
allocations) but runs in O(n lg n) time.

http://dpaste.dzfl.pl/e12f50ad947d

I found this paper which describes an in-place algorithm with 
O(n) time complexity but it's over my head at the moment.

http://link.springer.com/article/10.1007%2FBF01994842

It is trivial to implement an algorithm with O(n) time complexity 
and O(n) space complexity, but that requires allocating memory. 
Furthermore, using a buffer requires the range to have assignable 
elements.
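The trivial buffered approach mentioned above can be sketched in a few lines of Python (the function name and signature are mine for illustration, not the proposed Phobos API):

```python
def stable_partition3(xs, pivot, less=lambda a, b: a < b):
    """Stable three-way partition using O(n) time and O(n) extra space.

    Elements comparing less than, equal to, and greater than the pivot
    keep their relative input order within each group.
    """
    lo = [x for x in xs if less(x, pivot)]
    eq = [x for x in xs if not less(x, pivot) and not less(pivot, x)]
    hi = [x for x in xs if less(pivot, x)]
    return lo + eq + hi

print(stable_partition3([5, 1, 5, 3, 9, 5, 2], 5))
# -> [1, 3, 2, 5, 5, 5, 9]
```

This is what the buffer-based version amounts to conceptually; the hard part is getting the same result stably without the extra arrays.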


Any thoughts?
Jul 09 2015
next sibling parent reply "Xinok" <xinok live.com> writes:
On Thursday, 9 July 2015 at 21:57:39 UTC, Xinok wrote:
 I found this paper which describes an in-place algorithm with 
 O(n) time complexity but it's over my head at the moment.

 http://link.springer.com/article/10.1007%2FBF01994842
I apologize, I didn't realize this link was behind a paywall. The paper is freely available here:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.5554&rep=rep1&type=pdf
Jul 09 2015
parent reply "Nick B" <nick.barbalich gmail.com> writes:
On Friday, 10 July 2015 at 00:39:16 UTC, Xinok wrote:
 On Thursday, 9 July 2015 at 21:57:39 UTC, Xinok wrote:
 I found this paper which describes an in-place algorithm with 
 O(n) time complexity but it's over my head at the moment.
[snip]
 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.5554&rep=rep1&type=pdf
from the pdf, above, in case readers, like me, don't know the context of a Stable Partition implementation:

Stable minimum space partitioning in linear time
Jyrki Katajainen and Tomi Pasanen

Abstract. In the stable 0-1 sorting problem the task is to sort an array of n elements with two distinct values such that equal elements retain their relative input order. Recently, Munro, Raman and Salowe [BIT 1990] gave an algorithm which solves this problem in O(n log* n) time and constant extra space. We show that by a modification of their method the stable 0-1 sorting is possible in O(n) time and O(1) extra space. Stable three-way partitioning can be reduced to stable 0-1 sorting. This immediately yields a stable minimum space quicksort, which sorts multisets in asymptotically optimal time with high probability. CR categories: E.5, F.2.2.

The stable 0-1 sorting problem is defined as follows: Given an array of n elements and a function f mapping each element to the set {0,1}, the task is to rearrange the elements such that all elements whose f-value is zero come before elements whose f-value is one. Moreover, the relative order of elements with equal f-values should be maintained. For the sake of simplicity, we hereafter refer to bits instead of the f-values of elements. Stable partitioning is a special case of stable 0-1 sorting, where the f-values are obtained by comparing every element xi to some pivot element xj (which will not take part in partitioning):
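The reduction the abstract mentions can be sketched in Python. Here the paper's in-place 0-1 sort is stood in for by a simple O(n)-space version; names are mine, and only the reduction itself is from the paper:

```python
def stable_01_sort(xs, f):
    """Stable 0-1 sort: all elements with f-value 0 come before those
    with f-value 1, equal f-values keeping their relative order.
    (O(n) extra space here; the paper does this in place.)"""
    return [x for x in xs if f(x) == 0] + [x for x in xs if f(x) == 1]

def partition3_via_01(xs, pivot):
    """Stable three-way partition via two stable 0-1 sorts."""
    # First pass: "greater" elements get bit 1 and move to the back.
    xs = stable_01_sort(xs, lambda x: 1 if x > pivot else 0)
    k = sum(1 for x in xs if not x > pivot)
    # Second pass, within the front block: "less" elements get bit 0
    # and move to the front, leaving the "equal" block in the middle.
    return stable_01_sort(xs[:k], lambda x: 0 if x < pivot else 1) + xs[k:]

print(partition3_via_01([3, 1, 3, 5, 2], 3))
# -> [1, 2, 3, 3, 5]
```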
Jul 09 2015
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/10/2015 03:48 AM, Nick B wrote:
 On Friday, 10 July 2015 at 00:39:16 UTC, Xinok wrote:
 On Thursday, 9 July 2015 at 21:57:39 UTC, Xinok wrote:
 I found this paper which describes an in-place algorithm with O(n)
 time complexity but it's over my head at the moment.
... stable 0-1 sorting is possible in O(n) time and O(1) extra space.
Note how /they/ don't mention "complexity". This is because algorithms don't have "complexities". Problems do. (Sorry, pet peeve of mine.)
Jul 10 2015
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/09/2015 11:57 PM, Xinok wrote:
 I found this paper which describes an in-place algorithm with O(n) time
 complexity but it's over my head at the moment.

 http://link.springer.com/article/10.1007%2FBF01994842

 It is trivial to implement an algorithm with O(n) time complexity and
 O(n) space complexity, but that requires allocating memory. Furthermore,
 using a buffer requires the range to have assignable elements.


 Any thoughts?
I think this method is likely to be less practically relevant than the one they improve upon. (log* n really is constant for all practical purposes; it is the number of times one needs to iteratively take the logarithm until a number below 1 is obtained.)

That paper appears to be available here:

https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf

No idea what the constant is though, I have not read the paper (yet).
Jul 10 2015
parent reply "Xinok" <xinok live.com> writes:
On Friday, 10 July 2015 at 21:26:50 UTC, Timon Gehr wrote:
 I think this method is likely to be less practically relevant 
 than the one they improve upon. (log* n really is constant for 
 all practical purposes, it is the number of times one needs to 
 iteratively take the logarithm until a number below 1 is 
 obtained.)
log* n grows fast for small inputs, so we can't simply ignore it. For example, if n = 2^16 = 65536, then n*lg(n) = 16*2^16 = 2^20 = 1048576.
 That paper appears to be available here: 
 https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf

 No idea what the constant is though, I have not read the paper 
 (yet).
I don't know if there is anything relevant but that paper focuses on a different problem involving sorting.
Jul 10 2015
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/11/2015 02:32 AM, Xinok wrote:
 On Friday, 10 July 2015 at 21:26:50 UTC, Timon Gehr wrote:
 I think this method is likely to be less practically relevant than the
 one they improve upon. (log* n really is constant for all practical
 purposes, it is the number of times one needs to iteratively take the
 logarithm until a number below 1 is obtained.)
log* n grows fast for small inputs,
No, it does not. It has no space to grow. With base e, it is 3 or 4 for all input sizes that matter:

http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^8
http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^32
http://www.wolframalpha.com/input/?i=plot+IteratedLog+from+0+to+2^64
http://www.wolframalpha.com/input/?i=IteratedLog%282^2^2^2^2%29
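For concreteness, a quick Python check of the iterated logarithm (base e), matching the plots above:

```python
import math

def log_star(n, base=math.e):
    """Iterated logarithm: the number of times log must be applied
    before the value drops below 1."""
    count = 0
    x = float(n)
    while x >= 1.0:
        x = math.log(x, base)
        count += 1
    return count

for n in (2**8, 2**16, 2**32, 2**64):
    print(n, log_star(n))
# log*(2^8) = log*(2^16) = 3, log*(2^32) = log*(2^64) = 4
```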
 so we can't simply ignore it.
There is, a priori, no practical difference between an O(n) algorithm and an O(n*log*(n)) algorithm. It all depends on the constants, and it is hence not clear-cut that the O(n) algorithm will run faster. I'm just saying that this should be taken into consideration.
 For example, if n = 2^16 = 65536, then n*lg(n) = 16*2^16 = 2^20 = 1048576.
 ...
The algorithm runs in O(n*log*(n)). It's not n log(n).
 That paper appears to be available here:
 https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf


 No idea what the constant is though, I have not read the paper (yet).
I don't know if there is anything relevant but that paper focuses on a different problem involving sorting.
No, it focuses on a few closely related problems, and the O(n*log*(n)) algorithm solves a problem that is a straightforward generalization of the problem you are looking at. Stable three-way partition is stable sorting where there are only three "distinct" values (smaller, equal, greater).

The paper you provided builds on top of this algorithm. It is the main reference, and part (ii) of the "Algorithm B" part of the O(n)/O(1) algorithm does not occur in the paper you provided, but only in that other paper, so yes, it is relevant. :-)
Jul 10 2015
parent reply "Xinok" <xinok live.com> writes:
On Saturday, 11 July 2015 at 02:56:55 UTC, Timon Gehr wrote:
 ...
 The algorithm runs in O(n*log*(n)). It's not n log(n).
 ...
I understand now. I had never heard of an iterated logarithm before. I took the asterisk to mean some constant, like a wild card if you will. Sorry for the confusion.

I wonder if the true O(n) algorithm is still worth pursuing. Although log*(n) may only be a factor of 2 or 3, the O(n) algorithm may have a small constant factor which makes it 2-3x faster.
 That paper appears to be available here:
 https://github.com/lispc/lispc.github.com/raw/master/assets/files/situ.pdf


 No idea what the constant is though, I have not read the 
 paper (yet).
I don't know if there is anything relevant but that paper focuses on a different problem involving sorting.
No, it focuses on a few closely related problems, and the O(n*log*(n))-algorithms solves a problem that is a straightforward generalization of the problem you are looking at. Stable three way partition is stable sorting where there are only three "distinct" values (smaller, equal, greater). The paper you provided builds on top of this algorithm. It is the main reference and part (ii) of the "Algorithm B" part of the O(n)/O(1) algorithm does not occur in the paper you provided, but only in that other paper, so yes, it is relevant. :-)
I get that much now, but this paper is still way above my head. There's some notation in the sample code that I don't understand. In this statement:

e <- #{ j : L[j] = x, 1 <= j <= n }

I'm not sure what the pound # signifies or what exactly is being assigned to e. I'm assuming it performs some operation on the set, like "max", but I'm not sure what.
Jul 22 2015
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 22 July 2015 at 16:59:29 UTC, Xinok wrote:
 On Saturday, 11 July 2015 at 02:56:55 UTC, Timon Gehr wrote:
 [...]
I understand now. I had never heard of an iterated logarithm before. I took the asterisk to mean some constant, like a wild card if you will. Sorry for the confusion. I wonder if the true O(n) algorithm is still worth pursuing. Although log*(n) may only be a factor of 2 or 3, the O(n) algorithm may have a small constant factor which makes it 2-3x faster.
 [...]
I get that much now, but this paper is still way above my head. There's some notation in the sample code that I don't understand. In this statement: e <- #{ j : L[j] = x, 1 <= j <= n } I'm not sure what the pound # signifies or what exactly is being assigned to e. I'm assuming it performs some operation on the set, like "max", but I'm not sure what.
I haven't looked at the paper, but # followed by a set often means cardinality, i.e. (assuming the set is finite) the number of elements in the set.
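Under that reading, the statement simply counts how many entries of L equal x. In Python terms (data made up for illustration):

```python
# e <- #{ j : L[j] = x, 1 <= j <= n } read as "e gets the cardinality
# of the set of indices j for which L[j] equals x", i.e. the number
# of occurrences of x in L.
L = [4, 2, 4, 7, 4]
x = 4
e = sum(1 for j in range(len(L)) if L[j] == x)
print(e)  # -> 3
```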
Jul 22 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/9/15 5:57 PM, Xinok wrote:
 I wanted to get some feedback on a stable partition3 implementation for
 Phobos before I work on a pull request. I wrote this implementation
 which is in-place (zero memory allocations) but runs in O(n lg n) time.

 http://dpaste.dzfl.pl/e12f50ad947d

 I found this paper which describes an in-place algorithm with O(n) time
 complexity but it's over my head at the moment.

 http://link.springer.com/article/10.1007%2FBF01994842

 It is trivial to implement an algorithm with O(n) time complexity and
 O(n) space complexity, but that requires allocating memory. Furthermore,
 using a buffer requires the range to have assignable elements.


 Any thoughts?
I'd say both would be nice (not to mention the one in the paper), so how about both are present, selectable with a policy a la Yes.tightMemory or something. -- Andrei
Jul 10 2015
parent reply "Xinok" <xinok live.com> writes:
On Saturday, 11 July 2015 at 00:00:47 UTC, Andrei Alexandrescu 
wrote:
 On 7/9/15 5:57 PM, Xinok wrote:
 ...

 Any thoughts?
I'd say both would be nice (not to mention the one in the paper) so how about both are present selectable with a policy ala Yes.tightMemory or something. -- Andrei
I'm hesitant to add yet another template parameter; IMHO, it's bad enough that we have to manually write the predicate just to reach the swap strategy.

There's a fourth option I didn't think to mention. It's easy to utilize a small buffer to partition small sublists instead of performing insertions. partition3 would not allocate its own memory, so it would be in-place by default, but the user could provide their own buffer and pass it as an extra function argument. For example:

int[] arr = iota(0, 100000).array;
int[] buf = new int[100];
partition3!(...)(arr, 1000, buf);

Interestingly, for some constant k, if you implement this algorithm to use O(n/k) space, then it runs in O(n lg n) time because it performs at most O(lg k) rotations.

Regarding the algorithm in the paper, I don't have it completely figured out yet because it refers to algorithms in other papers which I can't find.
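For what it's worth, the rotation-based O(n lg n) divide-and-conquer scheme can be sketched in Python. This is my reconstruction of the general idea, not the dpaste code:

```python
def rotate(xs, first, mid, last):
    # Left-rotate xs[first:last] so that xs[mid] lands at xs[first].
    # Slicing is used for brevity; a truly in-place version would use
    # the triple-reversal trick.
    xs[first:last] = xs[mid:last] + xs[first:mid]

def stable_partition3_inplace(xs, lo, hi, pivot):
    """Stable three-way partition of xs[lo:hi] by divide and conquer.

    Returns (a, b) with xs[lo:a] < pivot, xs[a:b] == pivot and
    xs[b:hi] > pivot, each group keeping its input order.  Runs in
    O(n lg n) time with no auxiliary buffer.
    """
    if hi - lo == 0:
        return lo, lo
    if hi - lo == 1:
        if xs[lo] < pivot:
            return hi, hi
        if xs[lo] > pivot:
            return lo, lo
        return lo, hi
    mid = (lo + hi) // 2
    a1, b1 = stable_partition3_inplace(xs, lo, mid, pivot)
    a2, b2 = stable_partition3_inplace(xs, mid, hi, pivot)
    # The halves now look like [<1 =1 >1][<2 =2 >2].  Two rotations
    # merge them: first bring <2 in front of =1, giving
    # [<1 <2 =1 >1 =2 >2], then swap >1 with =2.
    nl2 = a2 - mid                 # length of the <2 block
    rotate(xs, a1, mid, a2)        # -> [<1 <2 =1 >1 =2 >2]
    rotate(xs, b1 + nl2, a2, b2)   # -> [<1 <2 =1 =2 >1 >2]
    a = a1 + nl2
    b = a + (b1 - a1) + (b2 - a2)
    return a, b

xs = [3, 1, 3, 5, 2]
print(stable_partition3_inplace(xs, 0, len(xs), 3), xs)
# -> (2, 4) [1, 2, 3, 3, 5]
```

Each merge step does O(n) work across a level and there are O(lg n) levels, hence the O(n lg n) bound; a small buffer for short sublists, as suggested above, would cut off the bottom of the recursion.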
Jul 10 2015
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/10/15 8:55 PM, Xinok wrote:
 I'm hesitant to add yet another template parameter; IMHO, it's bad
 enough that we have to manually write the predicate just to reach the
 swap strategy.
Then give them different names. -- Andrei
Jul 10 2015