## digitalmars.D - Idempotent partition around median of 5?

• Andrei Alexandrescu (17/17) Feb 03 2016 This appears a simple problem: given numbers a, b, c, d, e, swap them
• Era Scarecrow (7/16) Feb 03 2016 Well if we look at it, the max compares for optimal sorting
• Xinok (16/34) Feb 03 2016 The order of a,b and d,e don't matter because we're partitioning,
• Andrei Alexandrescu (2/4) Feb 03 2016 Thanks very much! -- Andrei
• Era Scarecrow (11/17) Feb 03 2016 Hmmm if i wrote it (this way) it would probably brute force the
• Timon Gehr (4/10) Feb 04 2016 Three swaps are always enough. Using the first swap, put the median
• Timon Gehr (244/261) Feb 04 2016 At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):
• Era Scarecrow (4/21) Feb 04 2016 That's about what i expected for the actual function (like this)
• Xinok (6/11) Feb 04 2016 Great, we can all go home!
• tn (45/48) Feb 05 2016 Inspired by this, I made four other versions of the function that
• Xinok (5/17) Feb 05 2016 Very nice! I'm curious, to both you and Timon, how did you go
• Timon Gehr (16/33) Feb 05 2016 I quickly hacked together a brute-force search.
• Andrei Alexandrescu (2/5) Feb 05 2016 I was thinking the same exact things. -- Andrei
• tn (32/52) Feb 06 2016 I basically just started from Timons solution and made is smaller
• Andrei Alexandrescu (12/34) Feb 06 2016 Whoa, I missed this last night. Awesome work! Thanks! I'm kinda running
• tn (103/108) Feb 06 2016 This now prints the numbers of comparison and swap lines and
• Andrei Alexandrescu (4/17) Feb 06 2016 Awesome, thanks. Will need to try at least a few of these out. At a
• Fool (13/15) Feb 05 2016 One swap usually decomposes into three moves. A cycle of length n
• Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= (4/6) Feb 05 2016 Yes, well, this all breaks down when you realize that this is all
• Era Scarecrow (11/12) Feb 05 2016 Recently from reading some interesting hacks and super code, a
• Xinok (9/21) Feb 05 2016 It's a cool trick but I would be surprised if this were actually
• Andrei Alexandrescu (2/3) Feb 05 2016 That changes every few years. Last time I measured it was slower. -- And...
• Timon Gehr (3/17) Feb 05 2016 The goal is to partition though, not to sort.
• Fool (2/4) Feb 05 2016 Indeed. I was wondering about that. Great job!
• Andrei Alexandrescu (8/9) Feb 05 2016 Timon, Ivan, this is amazing work. I don't know how your minds work
• Andrei Alexandrescu (3/4) Feb 05 2016 What is the minimum number of comparisons? Thx! -- Andrei
• Timon Gehr (4/8) Feb 05 2016 There is no partition algorithm that uses <= 5 comparisons in all cases.
• Timon Gehr (23/36) Feb 06 2016 All versions posted in this thread do 6 comparisons on all code paths.
• Andrei Alexandrescu (10/13) Feb 06 2016 Thanks. Tried this just now, it's better than the pre-discussion
• Andrei Alexandrescu (5/6) Feb 05 2016 Oh, also: could you let that bad boy run and let it find anything that
• tn (7/15) Feb 06 2016 That is kind of what I tried to do by hand. My function
• Ivan Kazmenko (58/62) Feb 05 2016 Here's my take at it.
```This appears a simple problem: given numbers a, b, c, d, e, swap them
around so as to place the median in c and partition the others around
it. I.e. the postcondition is: a <= c && b <= c && c <= d && c <= e.

Searching the Net for this isn't easy. Fortunately "our own" Xinok has
the best of breed result, see
http://stackoverflow.com/questions/11065963/possible-to-partition-five-elements-by-median-with-six-comparisons.

His algorithm is a little suboptimal in that when given a sorted array,
it sometimes leaves it unsorted. So I changed it to
http://dpaste.dzfl.pl/5fb2238d9891 to fix that. If I count right, it
also saves one swap: does 0-9 swaps instead of 0-10.

Ideally however, such an algorithm would do 0 swaps if the median is
already in c. My algorithm may still swap d and e without necessity.

So there's got to be a better solution. Your challenge - should you
choose to accept it :o) - is an algorithm that does the partitioning in
6 comparisons and <= 9 swaps, which is idempotent: when applied twice,
it always does 0 swaps.

Andrei
```
Feb 03 2016
```On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei Alexandrescu
wrote:
This appears a simple problem: given numbers a, b, c, d, e,
swap them around so as to place the median in c and partition
the others around it. I.e. the postcondition is: a <= c && b <=
c && c <= d && c <= e.

<snip>

So there's got to be a better solution. Your challenge - should
you choose to accept it :o) - is an algorithm that does the
partitioning in 6 comparisons and <= 9 swaps, which is
idempotent: when applied twice, it always does 0 swaps.

Well if we look at it, the max compares for optimal sorting
algorithm is the log2 of the max combinations in how the elements
could be arranged; !5 = 120 or 7 compares.

I'm not sure of the max swaps, seems like it could be 4 if done
right, but that might be impractical.
```
Feb 03 2016
```On Thursday, 4 February 2016 at 01:33:54 UTC, Era Scarecrow wrote:
On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei
Alexandrescu wrote:
This appears a simple problem: given numbers a, b, c, d, e,
swap them around so as to place the median in c and partition
the others around it. I.e. the postcondition is: a <= c && b
<= c && c <= d && c <= e.

<snip>

So there's got to be a better solution. Your challenge -
should you choose to accept it :o) - is an algorithm that does
the partitioning in 6 comparisons and <= 9 swaps, which is
idempotent: when applied twice, it always does 0 swaps.

Well if we look at it, the max compares for optimal sorting
algorithm is the log2 of the max combinations in how the
elements could be arranged; !5 = 120 or 7 compares.

The order of a,b and d,e don't matter because we're partitioning,
not sorting. So this problem can be solved in six comparisons.

I'm not sure of the max swaps, seems like it could be 4 if
done right, but that might be impractical.

I believe that's correct. Each swap can move at least one element
into it's final position. After three swaps, there may be two
elements which are still out of place, and swapping them will
move them both into their final position. Thus, four swaps max.
:-)

A solution to this problem does exist: Build a tree of if-else
branches for all the different possibilities (2^6=64) and
hard-code the optimal sequence of swaps. However, this function
would be huge so this isn't very practical. Ideally, we want to
minimize the size of the function as much as possible, adding in
a few extra swaps if necessary.

I'll take a crack at it this weekend. It sounds like an
interesting problem.
```
Feb 03 2016
```On 02/03/2016 09:46 PM, Xinok wrote:
I'll take a crack at it this weekend. It sounds like an interesting
problem.

Thanks very much! -- Andrei
```
Feb 03 2016
```On Thursday, 4 February 2016 at 02:46:42 UTC, Xinok wrote:
A solution to this problem does exist: Build a tree of if-else
branches for all the different possibilities (2^6=64) and
hard-code the optimal sequence of swaps. However, this function
would be huge so this isn't very practical.

Yeah that's kinda what i was coming to a conclusion to.

Ideally, we want to minimize the size of the function as much
as possible, adding in a few extra swaps if necessary.

Hmmm if i wrote it (this way) it would probably brute force the
proper way to build the function rather than doing it by hand
myself. So a helper function to make the optimal if/else
statements to deal with it.

BUT this only makes sense if it's always going to be 5 elements,
if it's fewer or more, then it's sorta pointless and a more
generic algorithm would be preferred.

Although writing a brute force program creation helper function
could still be of some use.
```
Feb 03 2016    Timon Gehr <timon.gehr gmx.ch> writes:
```On 02/04/2016 03:46 AM, Xinok wrote:
I'm not sure of the max swaps, seems like it could be 4 if done
right, but that might be impractical.

I believe that's correct. Each swap can move at least one element into
it's final position. After three swaps, there may be two elements which
are still out of place, and swapping them will move them both into their
final position. Thus, four swaps max. :-)

Three swaps are always enough. Using the first swap, put the median
where it belongs, and each subsequent swap can be chosen such that it
moves two elements to their final position.
```
Feb 04 2016
```On 02/04/2016 02:24 AM, Andrei Alexandrescu wrote:
This appears a simple problem: given numbers a, b, c, d, e, swap them
around so as to place the median in c and partition the others around
it. I.e. the postcondition is: a <= c && b <= c && c <= d && c <= e.

Searching the Net for this isn't easy. Fortunately "our own" Xinok has
the best of breed result, see
http://stackoverflow.com/questions/11065963/possible-to-partition-five-elements-by-median-with-six-comparisons.

His algorithm is a little suboptimal in that when given a sorted array,
it sometimes leaves it unsorted. So I changed it to
http://dpaste.dzfl.pl/5fb2238d9891 to fix that. If I count right, it
also saves one swap: does 0-9 swaps instead of 0-10.

Ideally however, such an algorithm would do 0 swaps if the median is
already in c. My algorithm may still swap d and e without necessity.

So there's got to be a better solution. Your challenge - should you
choose to accept it :o) - is an algorithm that does the partitioning in
6 comparisons and <= 9 swaps, which is idempotent: when applied twice,
it always does 0 swaps.

Andrei

At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

void partition5(ref int a){
if(a<a){
if(a<a){
if(a<a){
if(a<a){
if(a<a){
if(!(a<a)){
swap(a,a);
}
}else if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
if(a<a){
if(!(a<a)){
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
swap(a,a);
}else{swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}else if(a<a){
if(a<a){
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
swap(a,a);
}
}else if(a<a){
swap(a,a);
swap(a,a);
swap(a,a);
}else{
swap(a,a);
swap(a,a);
}
}
```
Feb 04 2016
```On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of
swaps):

void partition5(ref int a){
if(a<a){
if(a<a){
if(a<a){
if(a<a){
if(a<a){
if(!(a<a)){
swap(a,a);
}
}else if(a<a){
swap(a,a);
}else{
swap(a,a);
swap(a,a);
<snip>

That's about what i expected for the actual function (like this)
to look like. Course the only way to test this is to brute force
the combinations and confirm it's all in the order they should be.
```
Feb 04 2016
```On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of
swaps):

void partition5(ref int a){
if(a<a){
...

Great, we can all go home!

I was curious so I did a crude measurement: when compiled for
64-bit with all optimizations turned on (-release -O -inline
-boundscheck=off), the machine code for this function is about
3KB in size.
```
Feb 04 2016
```On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of
swaps):

...

Inspired by this, I made four other versions of the function that
are shorter but make more swaps (at most 4, 6, 7 and 8 swaps
respectively). Each makes 6 comparisons and should be idempotent.

http://dpaste.dzfl.pl/1c53d8f432f7

Here are the distributions of the number of swaps when tested
with all 120 possible permutations as the input:

testing 'partition5a' (by Timon Gehr)
0 swaps: 4 orderings
1 swaps: 32 orderings
2 swaps: 68 orderings
3 swaps: 16 orderings
testing 'partition5b'
0 swaps: 4 orderings
1 swaps: 20 orderings
2 swaps: 42 orderings
3 swaps: 40 orderings
4 swaps: 14 orderings
testing 'partition5c'
0 swaps: 4 orderings
1 swaps: 14 orderings
2 swaps: 25 orderings
3 swaps: 30 orderings
4 swaps: 26 orderings
5 swaps: 16 orderings
6 swaps: 5 orderings
testing 'partition5d'
0 swaps: 4 orderings
1 swaps: 14 orderings
2 swaps: 24 orderings
3 swaps: 29 orderings
4 swaps: 26 orderings
5 swaps: 16 orderings
6 swaps: 6 orderings
7 swaps: 1 orderings
testing 'partition5e'
0 swaps: 4 orderings
1 swaps: 12 orderings
2 swaps: 19 orderings
3 swaps: 25 orderings
4 swaps: 25 orderings
5 swaps: 18 orderings
6 swaps: 11 orderings
7 swaps: 5 orderings
8 swaps: 1 orderings
```
Feb 05 2016
```On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number
of swaps):

...

Inspired by this, I made four other versions of the function
that are shorter but make more swaps (at most 4, 6, 7 and 8
swaps respectively). Each makes 6 comparisons and should be
idempotent.

http://dpaste.dzfl.pl/1c53d8f432f7

...

Very nice! I'm curious, to both you and Timon, how did you go
about writing these and coming up with the solutions? I'm not
sure if I could come up with these results myself and so quickly
at that.
```
Feb 05 2016
```On 02/05/2016 10:42 PM, Xinok wrote:
On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

...

Inspired by this, I made four other versions of the function that are
shorter but make more swaps (at most 4, 6, 7 and 8 swaps
respectively). Each makes 6 comparisons and should be idempotent.

http://dpaste.dzfl.pl/1c53d8f432f7

...

Very nice! I'm curious, to both you and Timon, how did you go about
writing these and coming up with the solutions? I'm not sure if I could
come up with these results myself and so quickly at that.

I quickly hacked together a brute-force search.
Code: http://dpaste.dzfl.pl/43503913ac1d

The search only makes sure that it works for distinct input values.
Other input orders come for free; the verification code I have included
checks this.

The code recursively splits the set of all permutations on 5 elements by
all possible comparisons between two elements up to a maximal depth of 6
comparisons, using some basic pruning. The recursion terminates if all
permutations in the set can be partitioned using the same swaps. (I.e.,
the index of the median is fixed and there are exactly two possible
indices for the first two elements, and exactly two possible indices for
the last two elements.)

The partition5 function I have posted is the first result that the
search found. The above code also supports optimizing for source code
size (-version=SHORT). The result is not that much shorter though.
```
Feb 05 2016
```On 02/05/2016 04:42 PM, Xinok wrote:
Very nice! I'm curious, to both you and Timon, how did you go about
writing these and coming up with the solutions? I'm not sure if I could
come up with these results myself and so quickly at that.

I was thinking the same exact things. -- Andrei
```
Feb 05 2016    tn <no email.com> writes:
```On Friday, 5 February 2016 at 21:42:41 UTC, Xinok wrote:
On Friday, 5 February 2016 at 15:13:56 UTC, tn wrote:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number
of swaps):

...

Inspired by this, I made four other versions of the function
that are shorter but make more swaps (at most 4, 6, 7 and 8
swaps respectively). Each makes 6 comparisons and should be
idempotent.

http://dpaste.dzfl.pl/1c53d8f432f7

...

Very nice! I'm curious, to both you and Timon, how did you go
about writing these and coming up with the solutions? I'm not
sure if I could come up with these results myself and so
quickly at that.

I basically just started from Timons solution and made is smaller
by replacing branches by swaps. The outermost structure of Timons
code looks like this:

if (a < a) {
do something 1
} else {
do something 2
}

I replaced the above by

if (a < a) {
swap(a, a);
}
do something 1

So adding one swap just about halved the size of the code. Now
"do something 1" again has two branches:

if (a < a) {
do something 1.1
} else {
do something 1.2
}

So we can do the same trick again and replace it by

if (a < a) {
swap(a, a);
}
do something 1.1

And so on. (When doing subsequent swaps we also need to make sure
to not break the conditions achieved by the previous swaps.)

However, the swap of a and a in the first replacement would
break idempotence, so I also had to change which elements are
compared (and swapped). So in my code we first compare a and
a, then a and a, and so on.
```
Feb 06 2016
```On 02/05/2016 10:13 AM, tn wrote:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

...

Inspired by this, I made four other versions of the function that are
shorter but make more swaps (at most 4, 6, 7 and 8 swaps respectively).
Each makes 6 comparisons and should be idempotent.

http://dpaste.dzfl.pl/1c53d8f432f7

Whoa, I missed this last night. Awesome work! Thanks! I'm kinda running
out of time budget here for this particular subproblem, but I've got to
do these justice and try them out as well.

Here are the distributions of the number of swaps when tested with all
120 possible permutations as the input:

testing 'partition5a' (by Timon Gehr)
0 swaps: 4 orderings
1 swaps: 32 orderings
2 swaps: 68 orderings
3 swaps: 16 orderings
testing 'partition5b'
0 swaps: 4 orderings
1 swaps: 20 orderings
2 swaps: 42 orderings
3 swaps: 40 orderings
4 swaps: 14 orderings

[...]

distribution of data, compute the average number of swaps as a weighted
average of orderings. Also, show the number of lines (one test or one
swap per line) as a proxy for generated code size. That should provide
good insights.

Thanks!

Andrei
```
Feb 06 2016
```On Saturday, 6 February 2016 at 13:06:37 UTC, Andrei Alexandrescu
wrote:
random distribution of data, compute the average number of
swaps as a weighted average of orderings. Also, show the number
of lines (one test or one swap per line) as a proxy for
generated code size. That should provide good insights.

This now prints the numbers of comparison and swap lines and
computes the average numbers of swaps. In addition, it also runs
the same tests for permutations of 5 elements some of which might
be equal (an example of such permutation would be [1, 2, 0, 2,
3], on the other hand [1, 2, 0, 2, 4] should not be included
since it is identical). However, in this case the average number
of swaps is perhaps not so meaningful.

http://dpaste.dzfl.pl/2012caf872ec

Output:

testing 'partition5a'
Code: comparisons = 63, swaps = 107, total = 170
With all orderings of distinct elements:
0 swaps: 4 instances
1 swaps: 32 instances
2 swaps: 68 instances
3 swaps: 16 instances
Average number of swaps: 1.8
With all orderings of potentially nondistinct elements:
Error, not idempotent: [0, 0, 0, 0, 0] => [0, 0, 0, 0, 0]
testing 'partition5b'
Code: comparisons = 17, swaps = 21, total = 38
With all orderings of distinct elements:
0 swaps: 4 instances
1 swaps: 20 instances
2 swaps: 42 instances
3 swaps: 40 instances
4 swaps: 14 instances
Average number of swaps: 2.33333
With all orderings of potentially nondistinct elements:
0 swaps: 36 instances
1 swaps: 122 instances
2 swaps: 200 instances
3 swaps: 146 instances
4 swaps: 37 instances
Average number of swaps: 2.04806
testing 'partition5c'
Code: comparisons = 10, swaps = 12, total = 22
With all orderings of distinct elements:
0 swaps: 4 instances
1 swaps: 14 instances
2 swaps: 25 instances
3 swaps: 30 instances
4 swaps: 26 instances
5 swaps: 16 instances
6 swaps: 5 instances
Average number of swaps: 3.06667
With all orderings of potentially nondistinct elements:
0 swaps: 36 instances
1 swaps: 92 instances
2 swaps: 130 instances
3 swaps: 138 instances
4 swaps: 92 instances
5 swaps: 42 instances
6 swaps: 11 instances
Average number of swaps: 2.60628
testing 'partition5d'
Code: comparisons = 7, swaps = 8, total = 15
With all orderings of distinct elements:
0 swaps: 4 instances
1 swaps: 14 instances
2 swaps: 24 instances
3 swaps: 29 instances
4 swaps: 26 instances
5 swaps: 16 instances
6 swaps: 6 instances
7 swaps: 1 instances
Average number of swaps: 3.13333
With all orderings of potentially nondistinct elements:
0 swaps: 36 instances
1 swaps: 100 instances
2 swaps: 128 instances
3 swaps: 133 instances
4 swaps: 94 instances
5 swaps: 39 instances
6 swaps: 10 instances
7 swaps: 1 instances
Average number of swaps: 2.57486
testing 'partition5e'
Code: comparisons = 6, swaps = 8, total = 14
With all orderings of distinct elements:
0 swaps: 4 instances
1 swaps: 12 instances
2 swaps: 19 instances
3 swaps: 25 instances
4 swaps: 25 instances
5 swaps: 18 instances
6 swaps: 11 instances
7 swaps: 5 instances
8 swaps: 1 instances
Average number of swaps: 3.53333
With all orderings of potentially nondistinct elements:
0 swaps: 36 instances
1 swaps: 88 instances
2 swaps: 100 instances
3 swaps: 123 instances
4 swaps: 106 instances
5 swaps: 53 instances
6 swaps: 25 instances
7 swaps: 9 instances
8 swaps: 1 instances
Average number of swaps: 2.89649
```
Feb 06 2016
```On 02/06/2016 01:42 PM, tn wrote:
On Saturday, 6 February 2016 at 13:06:37 UTC, Andrei Alexandrescu wrote:
distribution of data, compute the average number of swaps as a
weighted average of orderings. Also, show the number of lines (one
test or one swap per line) as a proxy for generated code size. That
should provide good insights.

This now prints the numbers of comparison and swap lines and computes
the average numbers of swaps. In addition, it also runs the same tests
for permutations of 5 elements some of which might be equal (an example
of such permutation would be [1, 2, 0, 2, 3], on the other hand [1, 2,
0, 2, 4] should not be included since it is identical). However, in this
case the average number of swaps is perhaps not so meaningful.

http://dpaste.dzfl.pl/2012caf872ec

Awesome, thanks. Will need to try at least a few of these out. At a
quick glance, partition5d seems to be the sweet spot - it's small and
does only 3.13/2.57 swaps on average. -- Andrei
```
Feb 06 2016
```On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of
swaps)[...]

One swap usually decomposes into three moves. A cycle of length n
can be sorted with n + 1 moves. Corresponding to the seven
integer partitions of 5 we obtain:

5: 6
4 + 1: 5
3 + 2: 4 + 3
3 + 1 + 1: 4
2 + 2 + 1: 3 + 3
2 + 1 + 1 + 1: 3
1 + 1 + 1 + 1 + 1: 0

So we can do with seven moves instead of three swaps (nine
moves). ;-)
```
Feb 05 2016
```On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
So we can do with seven moves instead of three swaps (nine
moves). ;-)

Yes, well, this all breaks down when you realize that this is all
completely dominated by cache misses and that you can do median
more effectively using SIMD instructions. :^P
```
Feb 05 2016
```On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
One swap usually decomposes into three moves.

Recently from reading some interesting hacks and super code, a
good swap can also be done via 3 xor operations (and avoiding an
intermediate storage unit).

http://aggregate.ee.engr.uky.edu/MAGIC/#Swap%20Values%20Without%20a%20Temporary

x ^= y;
y ^= x;
x ^= y;

Since these can be used directly as registers it might be faster
(assuming all 5 are mapped to registers until the final write
out, which might not work).
```
Feb 05 2016
```On Friday, 5 February 2016 at 17:33:55 UTC, Era Scarecrow wrote:
On Friday, 5 February 2016 at 15:41:11 UTC, Fool wrote:
One swap usually decomposes into three moves.

Recently from reading some interesting hacks and super code, a
good swap can also be done via 3 xor operations (and avoiding
an intermediate storage unit).

http://aggregate.ee.engr.uky.edu/MAGIC/#Swap%20Values%20Without%20a%20Temporary

x ^= y;
y ^= x;
x ^= y;

Since these can be used directly as registers it might be
faster (assuming all 5 are mapped to registers until the final
write out, which might not work).

It's a cool trick but I would be surprised if this were actually
faster. Modern x64 processors have 16 "general purpose" registers
so saving a single register for a simple set of instructions is
likely to not have any benefit. My other thought is that the
compiler may not recognize this pattern, making it more difficult
to optimize. On the other hand, compilers are very good at
rearranging simple reads and writes to avoid stalling the
pipeline in the CPU.
```
Feb 05 2016
```On 02/05/2016 04:35 PM, Xinok wrote:
It's a cool trick but I would be surprised if this were actually faster.

That changes every few years. Last time I measured it was slower. -- Andrei
```
Feb 05 2016
```On 02/05/2016 04:41 PM, Fool wrote:
On Thursday, 4 February 2016 at 20:30:57 UTC, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of
swaps)[...]

One swap usually decomposes into three moves. A cycle of length n can be
sorted with n + 1 moves. Corresponding to the seven integer partitions
of 5 we obtain:

5: 6
4 + 1: 5
3 + 2: 4 + 3
3 + 1 + 1: 4
2 + 2 + 1: 3 + 3
2 + 1 + 1 + 1: 3
1 + 1 + 1 + 1 + 1: 0

So we can do with seven moves instead of three swaps (nine moves). ;-)

The goal is to partition though, not to sort.
Six moves are sufficient. (And necessary. E.g. for 42310.)
```
Feb 05 2016
```On Friday, 5 February 2016 at 21:48:58 UTC, Timon Gehr wrote:
The goal is to partition though, not to sort.
Six moves are sufficient. (And necessary. E.g. for 42310.)

Indeed. I was wondering about that. Great job!
```
Feb 05 2016
```On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

Timon, Ivan, this is amazing work. I don't know how your minds work
folks - I sat for like two hours on it yesterday and couldn't crack it.

Surprisingly in a test on millions of random numbers, Ivan's version
wins by a hair. Might be the tighter code which has icache effects.

Thanks very much! I should say that at these point no better definitions
can be found on the Net.

Andrei
```
Feb 05 2016
```On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

What is the minimum number of comparisons? Thx! -- Andrei
P.S. The sythesized searcher is genius.
```
Feb 05 2016
```On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

What is the minimum number of comparisons? Thx! -- Andrei

There is no partition algorithm that uses <= 5 comparisons in all cases.
(If that is what you're asking.)

P.S. The sythesized searcher is genius.

Thanks!
```
Feb 05 2016
```On 02/05/2016 08:58 PM, Timon Gehr wrote:
On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

What is the minimum number of comparisons? Thx! -- Andrei

There is no partition algorithm that uses <= 5 comparisons in all cases.
(If that is what you're asking.)

comparisons is 6, but it would be good to know the average number of
comparisons. I know I could read through the code, but it's hairy.
Thanks! -- Andrei
```
Feb 06 2016
```On 02/06/2016 02:00 PM, Andrei Alexandrescu wrote:
On 02/05/2016 08:58 PM, Timon Gehr wrote:
On 02/06/2016 02:39 AM, Andrei Alexandrescu wrote:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

What is the minimum number of comparisons? Thx! -- Andrei

There is no partition algorithm that uses <= 5 comparisons in all cases.
(If that is what you're asking.)

comparisons is 6, but it would be good to know the average number of
comparisons. I know I could read through the code, but it's hairy.
Thanks! -- Andrei

All versions posted in this thread do 6 comparisons on all code paths.
(It is not possible to do better.)

BTW, the code I had posted earlier suffers from the following flaws:

- Branches were cut off too early, so -version=SHORT didn't actually
find the shortest code. 

- The code (by necessity) only performs the optimal number of swaps if
all input elements are different. While it leaves already partitioned
integer arrays in the same state as they were encountered, if multiple
input elements are equal, the generated algorithm will sometimes swap
them, violating idempotence if input elements can compare equal but be
different. However, tn's versions fix this: they never perform any swaps
if the input array is already partitioned.

 This seems to be the shortest code that satisfies the specification
I have given (<=6 comparisons, optimal number of swaps) for permutations
and that performs all the comparing before all the swapping:

void partition5(ref int a){

if(a<a){if(a<a){if(a<a){if(a<a){if(a<a){if(a<a){swap(a,a);}}else{if(a<a){swap(a,a);}else{swap(a,a);swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}}}else{if(a<a){if(a<a){if(a<a){swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}}}}else{if(a<a){if(a<a){if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);}else{swap(a,a
);swap(a,a);}}}}else{if(a<a){if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);swap(a,a);}}}}}}else{if(a<a){if(a<a){if(a<a){if(a<a){if(a<a){swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);}else{swap(a,a);swap(a,a);}}}}else{if(a<a){if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);s
wap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}}}}else{if(a<a){if(a<a){if(a<a){if(a<a){swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);}else{swap(a,a);swap(a,a);}}}}else{if(a<a){if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);swap(a,a);}}}else{if(a<a){if(a<a){swap(a,a);swap(a,a);}else{swap(a,a);swap(a,a);}}else{if(a<a){swap(a,a);swap(a,a);swap(a,a);}else{swap(a,a);sw
ap(a,a);}}}}}}
}
```
Feb 06 2016
```On 2/6/16 9:26 AM, Timon Gehr wrote:
 This seems to be the shortest code that satisfies the specification
I have given (<=6 comparisons, optimal number of swaps) for permutations
and that performs all the comparing before all the swapping:

Thanks. Tried this just now, it's better than the pre-discussion
baselines but Ivan's still beats it (by a little). Then I just tried
tn's partition5c which beats Ivan's.

I should also add that returns are starting to diminish. Idempotence did
make a large difference. Then reducing max swaps made a smaller
difference (at least for the uints I'm working with).

BTW my testbed is a careful implementation of BFPRT on random arrays of
uint of various sizes.

Andrei
```
Feb 06 2016
```On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number of swaps):

Oh, also: could you let that bad boy run and let it find anything that
does idempotent partition in 6 compares and 0-7 swaps? Then slowly
tighten the number of swaps until you find the best balance between
number of swaps and code size. -- Andrei
```
Feb 05 2016
```On Saturday, 6 February 2016 at 01:46:58 UTC, Andrei Alexandrescu
wrote:
On 02/04/2016 03:30 PM, Timon Gehr wrote:
At most 6 comparisons, <=3 swaps, idempotent (optimal number
of swaps):

Oh, also: could you let that bad boy run and let it find
anything that does idempotent partition in 6 compares and 0-7
swaps? Then slowly tighten the number of swaps until you find
the best balance between number of swaps and code size. --
Andrei

That is kind of what I tried to do by hand. My function
partition5e seems to be practically identical to Ivans solution
and partition5a is just a copy of Timons solution. Function
partition5b, partition5c and partition5d are in between of those
with various tradeoffs between the number of swaps and code size.
```
Feb 06 2016
```On Thursday, 4 February 2016 at 01:24:15 UTC, Andrei Alexandrescu
wrote:
So there's got to be a better solution. Your challenge - should
you choose to accept it :o) - is an algorithm that does the
partitioning in 6 comparisons and <= 9 swaps, which is
idempotent: when applied twice, it always does 0 swaps.

Here's my take at it.

It's idempotent and does exactly 6 comparisons and 0-8 swaps.
The advantages to counter the not-so-good stats are that it has a
flat structure, and is not hard to reason about.

The idea is to process the following steps:

1. Find the minimum of {b, c, d, e} in three comparisons, and put
it into b.
The structure is: b < d, c < e, b < c.
Note that d and e were not compared if no swaps were made.

2. Find the minimum of {a, c, d, e} in two more comparisons, and
put it into a.
The structure is: a < d, c < e, a < c.
Note that a and b were not compared if no swaps were made.

3. Find the minimum of {c, d, e} in one more comparison, and put
it into c.
The structure is: c < d, c < e.
Note that d and e were not compared if no swaps were made.

In the end, we have a < c, b < c, c < d, c < e.
Additionally, if these inequalities held initially, everything is
left in place regardless of the results of comparisons of (a, b)
and (d, e).

In code:
-----
import std.algorithm;
import std.exception;
import std.range;
import std.stdio;

void p5 (ref int a, ref int b, ref int c, ref int d, ref int e)
{
if (d < b) {swap (b, d);}
if (e < c) {swap (c, e);}
if (c < b) {swap (b, c); swap (d, e);}
if (d < a) {swap (a, d);}
if (c < a) {swap (a, c); swap (d, e);}
if (d < c) {swap (c, d);}
}

void main ()
{
auto a = 5.iota.array;
do
{
auto b = a.dup;
p5 (b, b, b, b, b);
auto c = b.dup;
p5 (c, c, c, c, c);
writeln (a, " -> ", b, " -> ", c);
enforce (b < b && b < b &&
b < b && b < b);
enforce (equal (b, c));
}
while (nextPermutation (a));
}
-----

Another interesting task would be to make the function stable,
but I don't see how it is possible with such flat structure.

Ivan Kazmenko.
```
Feb 05 2016
```On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
Another interesting task would be to make the function stable, but I
don't see how it is possible with such flat structure.

Under what circumstances isn't your function unstable? -- Andrei
```
Feb 05 2016
```On 02/05/2016 07:59 PM, Andrei Alexandrescu wrote:
On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
Another interesting task would be to make the function stable, but I
don't see how it is possible with such flat structure.

Under what circumstances isn't your function unstable? -- Andrei

I meant "is your function unstable". Double negation, but if you speak
Russian those count as one :o). -- Andrei
```
Feb 05 2016
```On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei Alexandrescu
wrote:
On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
Another interesting task would be to make the function stable,
but I don't see how it is possible with such flat structure.

Under what circumstances isn't your function unstable? -- Andrei

For example, if elements are "value | id", compared by value,
then:
0|0, 0|1, 1|2, 0|3, 0|4
is transformed into
0|0, 0|1, 0|4, 0|3, 1|2
and 0|4 is now placed earlier than 0|3, which violates the
stability requirement.
Things can be reordered a bit, but it seems that no possible
order eliminates the need to remember a part of the history.

On the other hand, when we track our whole path in the decision
tree (e.g. at the leaves of Timon Gehr's tree of ifs), we have
all information to make the partition stable.

In any case, finding the shortest-code stable partition-of-5
algorithm looks like a problem solvable by an automated searcher.

Regarding the speed, there are different use cases with different
requirements, for example:

1. Primitive types (cheap swap, cheap comparison).

2. Heavy structs A (expensive swap, cheap comparison - e.g.,
compare one field of primitive type).

3. Heavy structs B (expensive swap, expensive comparison - e.g.,
call a heavy external function).

4. Heavy classes (cheap swap - pointers only, expensive
comparison).

So there's perhaps no single best solution.
```
Feb 05 2016
```On Saturday, 6 February 2016 at 07:06:27 UTC, Ivan Kazmenko wrote:
On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei
Alexandrescu wrote:
On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
Another interesting task would be to make the function
stable, but I don't see how it is possible with such flat
structure.

Under what circumstances isn't your function unstable? --
Andrei

The code I used to check stability:
http://dpaste.dzfl.pl/2b950cb3e5d8

The "y" and "n" at end of lines are "yes"/"no", as in
"stable"/"unstable".
```
Feb 05 2016
```On Saturday, 6 February 2016 at 07:06:27 UTC, Ivan Kazmenko wrote:
1. Primitive types (cheap swap, cheap comparison).

2. Heavy structs A (expensive swap, cheap comparison - e.g.,
compare one field of primitive type).

3. Heavy structs B (expensive swap, expensive comparison -
e.g., call a heavy external function).

4. Heavy classes (cheap swap - pointers only, expensive
comparison).

So there's perhaps no single best solution.

That's right, but other factors are more important: preventing
pipeline stalls. If you are collecting from 5 different
cachelines in an array you are likely to get several 40-120
cycles delays unless you do prefetching, and if you do, you need
to have other instructions to fill in the latency gaps.

But also instructions have latency and concurrency issues. Which
compares/swaps independent so that they could be concurrently
scheduled.

Yet, Haswell has SIMD instructions that can do 8-16x 32-bit
max/min operations per cycle, with a latency of only 1 cycle, and
4-8x 64bit compares with a latency of 1 cycle.

So if you use as  small fixed N, like 5, it makes very little
sense to count compares/swaps.

What makes sense is to focus on how you can avoid branching and
build an algorithm with no pipeline stalls.

If sorting large arrays you also might want to look at
```
Feb 06 2016    Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
```On 02/06/2016 02:06 AM, Ivan Kazmenko wrote:
On Saturday, 6 February 2016 at 00:59:17 UTC, Andrei Alexandrescu wrote:
On 02/05/2016 06:36 AM, Ivan Kazmenko wrote:
Another interesting task would be to make the function stable, but I
don't see how it is possible with such flat structure.

Under what circumstances isn't your function unstable? -- Andrei

For example, if elements are "value | id", compared by value, then:
0|0, 0|1, 1|2, 0|3, 0|4
is transformed into
0|0, 0|1, 0|4, 0|3, 1|2
and 0|4 is now placed earlier than 0|3, which violates the stability
requirement.
Things can be reordered a bit, but it seems that no possible order
eliminates the need to remember a part of the history.

On the other hand, when we track our whole path in the decision tree
(e.g. at the leaves of Timon Gehr's tree of ifs), we have all
information to make the partition stable.

In any case, finding the shortest-code stable partition-of-5 algorithm
looks like a problem solvable by an automated searcher.

Yah. I think stability should be solved if there's a need/application
for it, e.g. is there a larger algorithm that would be stable if
partition5 were stable?

Regarding the speed, there are different use cases with different
requirements, for example:

1. Primitive types (cheap swap, cheap comparison).

2. Heavy structs A (expensive swap, cheap comparison - e.g., compare one
field of primitive type).

3. Heavy structs B (expensive swap, expensive comparison - e.g., call a
heavy external function).

4. Heavy classes (cheap swap - pointers only, expensive comparison).

So there's perhaps no single best solution.

Yah, good point. I should also add that more comparisons generally mean
more branching and more code. -- Andrei
```
Feb 06 2016