|
Archives
D Programming
digitalmars.D
digitalmars.D.bugs
digitalmars.D.dtl
digitalmars.D.ide
digitalmars.D.dwt
digitalmars.D.announce
digitalmars.D.learn
digitalmars.D.debugger
D.gnu
D
C/C++ Programming
c++
c++.announce
c++.atl
c++.beta
c++.chat
c++.command-line
c++.dos
c++.dos.16-bits
c++.dos.32-bits
c++.idde
c++.mfc
c++.rtl
c++.stl
c++.stl.hp
c++.stl.port
c++.stl.sgi
c++.stlsoft
c++.windows
c++.windows.16-bits
c++.windows.32-bits
c++.wxwindows
digitalmars.empire
digitalmars.DMDScript
electronics
|
digitalmars.D.announce - Sorting algorithms benchmark
I've written a program to benchmark a number of sorting algorithms.
http://pr.stewartsplace.org.uk/d/sortbench.d
It includes options to output the timings to a CSV file and to control the
number of tests of each algorithm to perform (likely to be needed on modern
machines).
Stewart.
Stewart Gordon wrote:
I've written a program to benchmark a number of sorting algorithms.
http://pr.stewartsplace.org.uk/d/sortbench.d
It includes options to output the timings to a CSV file and to control the
number of tests of each algorithm to perform (likely to be needed on modern
machines).
Stewart.
Nice. One of the very first D programs I wrote was a class Sorter which aimed to
do the same, but I never published it since most of the code was just pretty
much directly copied from somewhere. Besides, I didn't really understand most of
the algorithms, and couldn't get all the ones I wanted to work. Pigeonhole sort,
in particular, couldn't reliably work for all types without opAssign
overloading, which was necessary.
Some comments from what I do (think I do) know, however:
Your comb sort uses a shrink factor of 0.7. Comb sort 11 uses 1 / (1 - 1 / (e ^
phi)), where e is Napier's constant and phi the golden ratio. This leads to
possible optimization in the algorithm.
But alas, the page that used to contain the article describing it is gone from
the web: http://world.std.com/~jdveale/combsort.htm. Dammit! I haven't copied
the source code for it from there, either. Oh well, a less optimal version but
still, I wager, better than yours:
http://yagni.com/combsort/index.php#cpp-combsort
What I used: const real SHRINK_FACTOR =
1.2473309501039786540366528676643474234433714124826;
You're missing shell sort: http://en.wikipedia.org/wiki/Shell_sort
See also: http://www.cs.princeton.edu/~rs/talks/shellsort.pdf
My 1.5-year old code (no idea how it works - I'm particularly frightened of the
bit math, I think it's the formula here:
http://www.research.att.com/~njas/sequences/A033622) is at the bottom of this
post, if you want to have a look.
Other sorts that could be added:
- Gnome sort, http://en.wikipedia.org/wiki/Gnome_sort
- Introspective sort: http://en.wikipedia.org/wiki/Introsort
- Three-way merge sort:
http://www.cs.queensu.ca/home/cisc121/2002w/assignments/assn8/solution/MergeThreeWay.java
- Counting sort, http://en.wikipedia.org/wiki/Counting_sort
- Bucket sort, http://en.wikipedia.org/wiki/Bucket_sort
- Pigeonhole sort, http://en.wikipedia.org/wiki/Pigeonhole_sort
I didn't look too much at your radix sorts, they might have a lot in common with
the last three of the above.
Also, your "buffered insertion sort" is perhaps more commonly known as "binary
insertion sort", since you mention it uses binary search.
--
// TYPE is just a template parameter
// comp defaults to a simple less-than comparison
// the inout is probably unnecessary, I wouldn't have known that back then
void shellSort(inout TYPE[] a, size_t beg = 0, size_t end = 0, cmpType cmp =
null) {
if (cmp is null) cmp = this.comp;
if (end == 0) end = a.length;
// can't figure out how to get it working for arbitrary beg and end, hence kept
separate here
void actualShellSort(inout TYPE[] ra) {
size_t h = 1, hh = 1;
while (h < ra.length) {
// magic numbers from formula:
//
http://www.research.att.com/cgi-bin/access.cgi/as/njas/sequences/eisA.cgi?Anum=A033622
if (hh % 2)
h = 8 << hh - 6 << ((hh + 1) / 2) + 1;
else
h = 9 << hh - 9 << ( hh / 2) + 1;
++hh;
}
while (h > 0) {
// for each set, of which there are h
for (size_t i = h - 1; i < ra.length; ++i) {
// pick last element in set
TYPE tmp = ra[i];
size_t j = i;
// compare tmp to the one before it in the set
// if they are not in order, continue this loop, moving
// elements back to make room for tmp
for (; j >= h && !cmp(tmp, ra[j-h]); j -= h)
ra[j] = ra[j - h];
ra[j] = tmp;
}
// all h-sets sorted
h /= 3;
}
}
if (!smartToSort(a, beg, end, cmp)) return;
TYPE[] foo = a[beg..end];
foo.actualShellSort();
}
// other relevant stuff to get the above to compile (not tested, though):
bit smartToSort(inout TYPE[] a, size_t beg, size_t end, cmpType cmp) {
if (end <= beg)
return false;
if (end - beg > a.length)
return false;
if (end - beg < 2)
return false;
else if (end - beg == 2) {
if (!cmp(a[end - 1], a[beg]))
a.swap(beg, end - 1);
return false;
}
return true;
}
alias int function(in TYPE left, in TYPE right) cmpType;
Deewiant Wrote:
<snip>
Your comb sort uses a shrink factor of 0.7. Comb sort 11 uses
1 / (1 - 1 / (e ^ phi)), where e is Napier's constant and phi
the golden ratio. This leads to possible optimization in the
algorithm.
Hang on - isn't 0.7 a grow factor, rather than a shrink factor?
<snip>
You're missing shell sort:
http://en.wikipedia.org/wiki/Shell_sort
Indeed, that could be implemented as well.
I noticed there:
"For instance, if a list was 5-sorted and then 3-sorted, the
list is now not only 3-sorted, but both 5- and 3-sorted."
Guess I should read up on the proof sometime.
<snip>
Other sorts that could be added:
- Gnome sort, http://en.wikipedia.org/wiki/Gnome_sort
Not sure if this one is worth it....
- Introspective sort: http://en.wikipedia.org/wiki/Introsort
- Three-way merge sort:
http://www.cs.queensu.ca/home/cisc121/2002w/assignments/assn8/solution/MergeThreeWay.java
Certainly worth considering.
- Counting sort, http://en.wikipedia.org/wiki/Counting_sort
- Bucket sort, http://en.wikipedia.org/wiki/Bucket_sort
- Pigeonhole sort,
http://en.wikipedia.org/wiki/Pigeonhole_sort
It took me a moment to figure the difference between these three. But I would
need to reduce the range of the random sequences a lot before any of them will
be remotely feasible.
Stewart.
Stewart Gordon wrote:
Deewiant Wrote:
Your comb sort uses a shrink factor of 0.7. Comb sort 11 uses
1 / (1 - 1 / (e ^ phi)), where e is Napier's constant and phi
the golden ratio. This leads to possible optimization in the
algorithm.
Hang on - isn't 0.7 a grow factor, rather than a shrink factor?
I took a more detailed (but still quite cursory) look at your code, and it seems
you do things differently from the "traditional" method. I'm not sure if that's
even equivalent to the normal comb sort.
But yes, since 0.7 is less than 1, I suppose you could call it a grow factor. It
still shrinks the size of the gap you're looking at, though. Up to you, but all
references to comb sort I've seen talk about shrink factors. :-)
Other sorts that could be added:
- Gnome sort, http://en.wikipedia.org/wiki/Gnome_sort
Not sure if this one is worth it....
It's only a few lines of code, so there's not much to be worth. And at least
it's faster than bubble sort. ;-)
--
Remove ".doesnotlike.spam" from the mail address.
Deewiant Wrote:
<snip>
I took a more detailed (but still quite cursory) look at your
code, and it seems you do things differently from the
"traditional" method. I'm not sure if that's even equivalent
to the normal comb sort.
What do you consider to be the "normal comb sort"? OK, so the call to
lastSwapBubbleSort was my idea, but other than that, there doesn't seem to be
any difference at all, at least from this:
http://en.wikipedia.org/wiki/Comb_sort
<snip>
Other sorts that could be added:
- Gnome sort, http://en.wikipedia.org/wiki/Gnome_sort
Not sure if this one is worth it....
It's only a few lines of code, so there's not much to be
worth. And at least it's faster than bubble sort. ;-)
Looking at it on Wikipedia, it appears to be just a way of implementing
insertion sort without nesting loops. Strange - before, I'd understood it to
be something even slower - only one index variable, and so after moving an
element down you step forward one by one just to get back to where you moved it
from.
Stewart.
Stewart Gordon wrote:
Deewiant Wrote:
<snip>
I took a more detailed (but still quite cursory) look at your
code, and it seems you do things differently from the
"traditional" method. I'm not sure if that's even equivalent
to the normal comb sort.
What do you consider to be the "normal comb sort"? OK, so the call to
lastSwapBubbleSort was my idea, but other than that, there doesn't seem to be
any difference at all, at least from this:
http://en.wikipedia.org/wiki/Comb_sort
That's exactly what's confusing me. I suppose the reason you need that call is
that you're not counting the number of swaps made in the loop, and you're
stopping just when the gap is 1, not when swaps are also zero. Which I think
makes the algorithm a bit different, since the number of "combs" you do is
limited to once per gap size, plus the one bubble sort after you're done.
I added my comb sort 11 implementation to your benchmark, and it's noticeably
faster for random data than your comb sort, at least on my machine. So
something's different, to be sure. Even without the optimization related to gap
size 11, mine is faster.
Here's the source, changed to work with your benchmark, if you're interested:
void combSort11(uint[] a)
out {
debug (100) printList(a);
checkSorted(a);
} body {
// empirically found to be good at:
// http://world.std.com/~jdveale/combsort.htm
// 1 / (1 - 1 / (e ^ phi)), phi being golden ratio = (sqrt(5) + 1)/2
const real SHRINK_FACTOR =
1.2473309501039786540366528676643474234433714124826L;
bit swapped = false;
size_t gap = a.length;
do {
if (gap > 1) {
gap = cast(size_t)(cast(real) gap / SHRINK_FACTOR);
// hence, comb sort 11
if (gap == 9 || gap == 10)
gap = 11;
}
swapped = false;
for (size_t i = 0; i < a.length - gap; ++i) {
size_t j = i + gap;
if (a[i] > a[j]) {
swap(a[i], a[j]);
swapped = true;
}
}
} while (swapped || gap > 1);
}
--
Remove ".doesnotlike.spam" from the mail address.
Deewiant Wrote:
<snip>
That's exactly what's confusing me. I suppose the reason you
need that call is that you're not counting the number of swaps
made in the loop, and you're stopping just when the gap is 1,
not when swaps are also zero. Which I think makes the
algorithm a bit different, since the number of "combs" you do
is limited to once per gap size, plus the one bubble sort
after you're done.
It's more the other way round - calling bubble sort is the way I chose to
implement the passes of gap size 1. It saves having to maintain a swap flag
during the passes of gap > 1.
I added my comb sort 11 implementation to your benchmark, and
it's noticeably faster for random data than your comb sort, at
least on my machine. So something's different, to be sure.
Even without the optimization related to gap size 11, mine is
faster.
That appears to be because you've implemented the optimum shrink factor. If I
change my implementation to use it as well, it runs at about the same speed as
yours.
Stewart.
Stewart Gordon wrote:
Deewiant Wrote:
I added my comb sort 11 implementation to your benchmark, and
it's noticeably faster for random data than your comb sort, at
least on my machine. So something's different, to be sure.
Even without the optimization related to gap size 11, mine is
faster.
That appears to be because you've implemented the optimum shrink factor. If I
change my implementation to use it as well, it runs at about the same speed as
yours.
Ah, of course. That clears it up.
I ran the whole benchmark: for these cases, merge sort and comb sort 11 are the
only two to beat the built-in sort (taking the average of the results of all 9
runs). Interesting.
--
Remove ".doesnotlike.spam" from the mail address.
|
|