www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - qsort

reply Mike <Mike_member pathlink.com> writes:
has anyone written a qsort() for D? We need one that supports delegates.

If not I will probably have to write one myself.
Jul 13 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Mike" <Mike_member pathlink.com> wrote in message 
news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

harmonia/tools.d // "smart" sort implementation, works 1.3 times faster than D builtin sort template sort(T) { void sort(T[] arr, int function(in T l, in T r) cmp) { const int quick_sort_threshold = 9; if(arr.length < 2) return; void swap_if_less( inout T t1, inout T t2 ) { if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; } } void swap( inout T t1, inout T t2 ) { T t = t1; t1 = t2; t2 = t; } int stack[80]; int* top = stack; int limit = arr.length; int base = 0; for(;;) { int len = limit - base; int i; int j; int pivot; if(len > quick_sort_threshold) { // we use base + len/2 as the pivot pivot = base + len / 2; swap(arr[base], arr[pivot]); i = base + 1; j = limit - 1; // now ensure that *i <= *base <= *j swap_if_less(arr[j], arr[i]); swap_if_less(arr[base],arr[i]); swap_if_less(arr[j],arr[base]); for(;;) { do i++; while( cmp(arr[i],arr[base]) < 0 ); do j--; while( cmp(arr[base],arr[j]) < 0 ); if( i > j ) { break; } swap(arr[i], arr[j]); } swap(arr[base], arr[j]); // now, push the largest sub-array if(j - base > limit - i) { top[0] = base; top[1] = j; base = i; } else { top[0] = i; top[1] = limit; limit = j; } top += 2; } else { // the sub-array is small, perform insertion sort j = base; i = j + 1; for(; i < limit; j = i, i++) { for(; cmp(arr[j + 1],arr[j]) < 0; j--) { swap(arr[j + 1],arr[j]); if(j == base) break; } } if(top > stack) { top -= 2; base = top[0]; limit = top[1]; } else break; } } } }
Jul 13 2005
next sibling parent reply "Uwe Salomon" <post uwesalomon.de> writes:
 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin  
 sort

Did you ever test it with input data like this: 1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6 That means a lot of sorted elements, and a bunch of unsorted afterwards. Ciao uwe
Jul 13 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Uwe Salomon" <post uwesalomon.de> wrote in message 
news:op.stvdh7qm6yjbe6 sandmann.maerchenwald.net...
 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin 
 sort

Did you ever test it with input data like this: 1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6 That means a lot of sorted elements, and a bunch of unsorted afterwards.

I did. 'sort!' is mine, 'sort' is D's builtin. less number - the better. sequence from 1 to 1_000_000 sort! - 203 ms sort - 265 ms takes 76% of D time sequence from 1_000_000 to 1 sort! - 125 ms sort - 171 ms takes 71% of D time ------------------------------------------------------------- Theory is here: http://www.seeingwithc.org/topic2html.html ------------------------------------------------------------- ATTN!!!: As algorithms like this belong to the whole humanity then my implementation is completely free :-P ------------------------------------------------------------ //1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6 int icmp(in int l, in int r) { return l - r; } int main(char[][] args) { ProcessTimesCounter counter = new ProcessTimesCounter(); const uint size = 1000000; int[] data = new int[size]; for(uint i = 0; i < size; ++i) data[i] = i; data[size - 6] = 13; data[size - 5] = 12; data[size - 4] = 9; data[size - 3] = -1; data[size - 2] = 7; data[size - 1] = 6; counter.start(); sort!(int)(data,&icmp); //data.sort; counter.stop(); ProcessTimesCounter.interval_type ms1 = counter.milliseconds(); writef("ms=%d\n",ms1); return 0; } Andrew. http://terrainformatica.com
Jul 13 2005
prev sibling next sibling parent "Uwe Salomon" <post uwesalomon.de> writes:
 // "smart" sort implementation, works 1.3 times faster than D builtin  
 sort

You can also look into Indigo. List has a very good QuickSort, but it is a little bit difficult to read because of the specialties of the List implementation, and it does not support special sorting functions. But i have included such a function for Vector, and after changing it to receive the comparison function as a template argument instead of a regular parameter, it needs 70% of the time of the D builtin sort. And it has the great advantage that it is linear for radix sort cases (i.e. a lot of elements, but only few different values: 3 3 4 5 2 2 2 2 3 4 7 5 3 4 5 2 2 2 etc.). But keep in mind: The library is LGPL'd, and as it is a derivative work, your program has to be conformant to the GPL. Ciao uwe
Jul 13 2005
prev sibling next sibling parent Jan-Eric Duden <jeduden whisset.com> writes:
Andrew Fedoniouk wrote:
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 
has anyone written a qsort() for D? We need one that supports delegates.

If not I will probably have to write one myself.

harmonia/tools.d // "smart" sort implementation, works 1.3 times faster than D builtin sort template sort(T) { void sort(T[] arr, int function(in T l, in T r) cmp) { const int quick_sort_threshold = 9; if(arr.length < 2) return; void swap_if_less( inout T t1, inout T t2 ) { if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; } } void swap( inout T t1, inout T t2 ) { T t = t1; t1 = t2; t2 = t; } int stack[80]; int* top = stack; int limit = arr.length; int base = 0; for(;;) { int len = limit - base; int i; int j; int pivot; if(len > quick_sort_threshold) { // we use base + len/2 as the pivot pivot = base + len / 2; swap(arr[base], arr[pivot]); i = base + 1; j = limit - 1; // now ensure that *i <= *base <= *j swap_if_less(arr[j], arr[i]); swap_if_less(arr[base],arr[i]); swap_if_less(arr[j],arr[base]); for(;;) { do i++; while( cmp(arr[i],arr[base]) < 0 ); do j--; while( cmp(arr[base],arr[j]) < 0 ); if( i > j ) { break; } swap(arr[i], arr[j]); } swap(arr[base], arr[j]); // now, push the largest sub-array if(j - base > limit - i) { top[0] = base; top[1] = j; base = i; } else { top[0] = i; top[1] = limit; limit = j; } top += 2; } else { // the sub-array is small, perform insertion sort j = base; i = j + 1; for(; i < limit; j = i, i++) { for(; cmp(arr[j + 1],arr[j]) < 0; j--) { swap(arr[j + 1],arr[j]); if(j == base) break; } } if(top > stack) { top -= 2; base = top[0]; limit = top[1]; } else break; } } } }

Your Qsort has one draw-back its O(N^2) (worst-case). You can fix this, by switching to heap sort if your stack gets too large. :) Regards, Jan-Eric
Jul 14 2005
prev sibling parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:db3h2l$k2s$1 digitaldaemon.com...
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

harmonia/tools.d // "smart" sort implementation, works 1.3 times faster than D builtin sort

Have you analyzed why it is faster? I'd be curious. My first guess from looking at phobos/internal/qsort.d is that the templated version has more inlinable code. The algorithm looks the same to me at first glance. Two differences I notice are a quicker switch to insertion sort (9 vs 7) and inlined swap (dmd's calls the typeinfo's swap).
Jul 14 2005
next sibling parent "Uwe Salomon" <post uwesalomon.de> writes:
 Have you analyzed why it is faster? I'd be curious. My first guess from
 looking at phobos/internal/qsort.d is that the templated version has more
 inlinable code. The algorithm looks the same to me at first glance. Two
 differences I notice are a quicker switch to insertion sort (9 vs 7) and
 inlined swap (dmd's calls the typeinfo's swap).

That should be the reason. Function calls are quite expensive. I have switched my quicksort from receiving the comparison function as a normal parameter to receiving it as a template parameter, and that has increased its speed from 1.2 times Phobos to 0.7 times Phobos (the algorithm is quite different, too, but that is not relevant for this comparison). Ciao uwe
Jul 14 2005
prev sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:db5mhn$2lfo$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
 news:db3h2l$k2s$1 digitaldaemon.com...
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

harmonia/tools.d // "smart" sort implementation, works 1.3 times faster than D builtin sort

Have you analyzed why it is faster? I'd be curious. My first guess from looking at phobos/internal/qsort.d is that the templated version has more inlinable code. The algorithm looks the same to me at first glance. Two differences I notice are a quicker switch to insertion sort (9 vs 7) and inlined swap (dmd's calls the typeinfo's swap).

It is all about function calls: I've removed cmp comparator function and result is even more spectacular: 'sort!' is mine, 'sort!-inline', is mine but with inlined cmp function, 'sort' is D's builtin. less number - the better. sequence from 1 to 1_000_000 sort!-inline - 78 ms sort! - 203 ms sort - 265 ms 'sort!' takes 76% of D time 'sort!-inline' takes 29% of D's builtin sort time sequence from 1_000_000 to 1 sort!-inline - 46 ms sort! - 125 ms sort - 171 ms 'sort!' takes 71% of D time 'sort!-inline' 26% of D time -------------- template sort(T) { void sort(T[] arr) { int cmp(in T l, in T r) { return l - r; } ..................... } }
Jul 14 2005