digitalmars.D

digitalmars.D - qsort

Mike (2/2) Jul 13 2005 has anyone written a qsort() for D? We need one that supports delegates.

Andrew Fedoniouk (91/93) Jul 13 2005 harmonia/tools.d

Uwe Salomon (5/8) Jul 13 2005 Did you ever test it with input data like this:

Andrew Fedoniouk (46/53) Jul 13 2005 I did.

Uwe Salomon (13/15) Jul 13 2005 You can also look into Indigo. List has a very good QuickSort, but it is...
Jan-Eric Duden (6/121) Jul 14 2005 Your Qsort has one draw-back its O(N^2) (worst-case). You can fix this,
Ben Hinkle (7/14) Jul 14 2005 Have you analyzed why it is faster? I'd be curious. My first guess from

Uwe Salomon (7/12) Jul 14 2005 That should be the reason. Function calls are quite expensive. I have
Andrew Fedoniouk (29/47) Jul 14 2005 It is all about function calls:

Mike <Mike_member pathlink.com> writes:

has anyone written a qsort() for D? We need one that supports delegates.

If not I will probably have to write one myself.

Jul 13 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Mike" <Mike_member pathlink.com> wrote in message 
news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

harmonia/tools.d

// "smart" sort implementation, works 1.3 times faster than D builtin sort

template sort(T)
{
  void sort(T[] arr, int function(in T l, in T r) cmp)
  {
      const int quick_sort_threshold = 9;

      if(arr.length < 2) return;

      void swap_if_less( inout T t1, inout T t2 )
      {
        if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; }
      }
      void swap( inout T t1, inout T t2 )
      {
        T t = t1; t1 = t2; t2 = t;
      }

      int  stack[80];
      int* top = stack;
      int  limit = arr.length;
      int  base = 0;

      for(;;)
      {
          int len = limit - base;

          int i;
          int j;
          int pivot;

          if(len > quick_sort_threshold)
          {
              // we use base + len/2 as the pivot
              pivot = base + len / 2;
              swap(arr[base], arr[pivot]);

              i = base + 1;
              j = limit - 1;

              // now ensure that *i <= *base <= *j
              swap_if_less(arr[j], arr[i]);
              swap_if_less(arr[base],arr[i]);
              swap_if_less(arr[j],arr[base]);

              for(;;)
              {
                  do i++; while( cmp(arr[i],arr[base]) < 0 );
                  do j--; while( cmp(arr[base],arr[j]) < 0 );
                  if( i > j )
                  {
                      break;
                  }
                  swap(arr[i], arr[j]);
              }

              swap(arr[base], arr[j]);

              // now, push the largest sub-array
              if(j - base > limit - i)
              {
                  top[0] = base;
                  top[1] = j;
                  base   = i;
              }
              else
              {
                  top[0] = i;
                  top[1] = limit;
                  limit  = j;
              }
              top += 2;
          }
          else
          {
              // the sub-array is small, perform insertion sort
              j = base;
              i = j + 1;

              for(; i < limit; j = i, i++)
              {
                  for(; cmp(arr[j + 1],arr[j]) < 0; j--)
                  {
                      swap(arr[j + 1],arr[j]);
                      if(j == base)
                          break;
                  }
              }
              if(top > stack)
              {
                  top  -= 2;
                  base  = top[0];
                  limit = top[1];
              }
              else
                  break;
          }
      }
  }
}

Jul 13 2005

"Uwe Salomon" <post uwesalomon.de> writes:

 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin  
 sort

Did you ever test it with input data like this:

1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6

That means a lot of sorted elements, and a bunch of unsorted afterwards.

Ciao
uwe

Jul 13 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Uwe Salomon" <post uwesalomon.de> wrote in message 
news:op.stvdh7qm6yjbe6 sandmann.maerchenwald.net...
 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin 
 sort

 Did you ever test it with input data like this:

 1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6

 That means a lot of sorted elements, and a bunch of unsorted afterwards.

I did.

'sort!' is mine, 'sort' is D's builtin.
less number - the better.

sequence from 1 to 1_000_000

  sort! - 203 ms
  sort  - 265 ms

takes 76% of D time

sequence from 1_000_000 to 1

  sort! - 125 ms
  sort  - 171 ms

takes 71% of D time
-------------------------------------------------------------
Theory is here:
http://www.seeingwithc.org/topic2html.html
-------------------------------------------------------------

ATTN!!!: As algorithms like this
belong to the whole humanity then my
implementation is completely free :-P

------------------------------------------------------------
//1, 2, 3, 4, 5, ..., 99998, 99999, 100000, 13, 12, 9, -1, 7, 6

int icmp(in int l, in int r) { return l - r; }

int main(char[][] args)
{
    ProcessTimesCounter counter = new ProcessTimesCounter();
    const uint size = 1000000;
    int[] data = new int[size];
    for(uint i = 0; i < size; ++i)
        data[i] = i;

   data[size - 6] = 13;
   data[size - 5] = 12;
   data[size - 4] = 9;
   data[size - 3] = -1;
   data[size - 2] = 7;
   data[size - 1] = 6;

    counter.start();
           sort!(int)(data,&icmp);
           //data.sort;
    counter.stop();
    ProcessTimesCounter.interval_type  ms1 = counter.milliseconds();
    writef("ms=%d\n",ms1);

    return 0;
}

Andrew.
http://terrainformatica.com

Jul 13 2005

"Uwe Salomon" <post uwesalomon.de> writes:

 // "smart" sort implementation, works 1.3 times faster than D builtin  
 sort

You can also look into Indigo. List has a very good QuickSort, but it is a  
little bit difficult to read because of the specialties of the List  
implementation, and it does not support special sorting functions. But i  
have included such a function for Vector, and after changing it to receive  
the comparison function as a template argument instead of a regular  
parameter, it needs 70% of the time of the D builtin sort. And it has the  
great advantage that it is linear for radix sort cases (i.e. a lot of  
elements, but only few different values: 3 3 4 5 2 2 2 2 3 4 7 5 3 4 5 2 2  
2 etc.).

But keep in mind: The library is LGPL'd, and as it is a derivative work,  
your program has to be conformant to the GPL.

Ciao
uwe

Jul 13 2005

Jan-Eric Duden <jeduden whisset.com> writes:

Andrew Fedoniouk wrote:
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 
has anyone written a qsort() for D? We need one that supports delegates.

If not I will probably have to write one myself.

 
 
 harmonia/tools.d
 
 // "smart" sort implementation, works 1.3 times faster than D builtin sort
 
 template sort(T)
 {
   void sort(T[] arr, int function(in T l, in T r) cmp)
   {
       const int quick_sort_threshold = 9;
 
       if(arr.length < 2) return;
 
       void swap_if_less( inout T t1, inout T t2 )
       {
         if( cmp(t1, t2) < 0 ) { T t = t1; t1 = t2; t2 = t; }
       }
       void swap( inout T t1, inout T t2 )
       {
         T t = t1; t1 = t2; t2 = t;
       }
 
       int  stack[80];
       int* top = stack;
       int  limit = arr.length;
       int  base = 0;
 
       for(;;)
       {
           int len = limit - base;
 
           int i;
           int j;
           int pivot;
 
           if(len > quick_sort_threshold)
           {
               // we use base + len/2 as the pivot
               pivot = base + len / 2;
               swap(arr[base], arr[pivot]);
 
               i = base + 1;
               j = limit - 1;
 
               // now ensure that *i <= *base <= *j
               swap_if_less(arr[j], arr[i]);
               swap_if_less(arr[base],arr[i]);
               swap_if_less(arr[j],arr[base]);
 
               for(;;)
               {
                   do i++; while( cmp(arr[i],arr[base]) < 0 );
                   do j--; while( cmp(arr[base],arr[j]) < 0 );
                   if( i > j )
                   {
                       break;
                   }
                   swap(arr[i], arr[j]);
               }
 
               swap(arr[base], arr[j]);
 
               // now, push the largest sub-array
               if(j - base > limit - i)
               {
                   top[0] = base;
                   top[1] = j;
                   base   = i;
               }
               else
               {
                   top[0] = i;
                   top[1] = limit;
                   limit  = j;
               }
               top += 2;
           }
           else
           {
               // the sub-array is small, perform insertion sort
               j = base;
               i = j + 1;
 
               for(; i < limit; j = i, i++)
               {
                   for(; cmp(arr[j + 1],arr[j]) < 0; j--)
                   {
                       swap(arr[j + 1],arr[j]);
                       if(j == base)
                           break;
                   }
               }
               if(top > stack)
               {
                   top  -= 2;
                   base  = top[0];
                   limit = top[1];
               }
               else
                   break;
           }
       }
   }
 }
 
 


Your Qsort has one draw-back its O(N^2) (worst-case). You can fix this, 
by switching to heap sort if your stack gets too large.

:)

Regards,
Jan-Eric

Jul 14 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:db3h2l$k2s$1 digitaldaemon.com...
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin sort

Have you analyzed why it is faster? I'd be curious. My first guess from 
looking at phobos/internal/qsort.d is that the templated version has more 
inlinable code. The algorithm looks the same to me at first glance. Two 
differences I notice are a quicker switch to insertion sort (9 vs 7) and 
inlined swap (dmd's calls the typeinfo's swap).

Jul 14 2005

"Uwe Salomon" <post uwesalomon.de> writes:

 Have you analyzed why it is faster? I'd be curious. My first guess from
 looking at phobos/internal/qsort.d is that the templated version has more
 inlinable code. The algorithm looks the same to me at first glance. Two
 differences I notice are a quicker switch to insertion sort (9 vs 7) and
 inlined swap (dmd's calls the typeinfo's swap).

That should be the reason. Function calls are quite expensive. I have  
switched my quicksort from receiving the comparison function as a normal  
parameter to receiving it as a template parameter, and that has increased  
its speed from 1.2 times Phobos to 0.7 times Phobos (the algorithm is  
quite different, too, but that is not relevant for this comparison).

Ciao
uwe

Jul 14 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:db5mhn$2lfo$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
 news:db3h2l$k2s$1 digitaldaemon.com...
 "Mike" <Mike_member pathlink.com> wrote in message 
 news:db34a9$9hs$1 digitaldaemon.com...
 has anyone written a qsort() for D? We need one that supports delegates.

 If not I will probably have to write one myself.

 harmonia/tools.d

 // "smart" sort implementation, works 1.3 times faster than D builtin 
 sort

 Have you analyzed why it is faster? I'd be curious. My first guess from 
 looking at phobos/internal/qsort.d is that the templated version has more 
 inlinable code. The algorithm looks the same to me at first glance. Two 
 differences I notice are a quicker switch to insertion sort (9 vs 7) and 
 inlined swap (dmd's calls the typeinfo's swap).

It is all about function calls:
I've removed cmp comparator function and result is even more spectacular:

'sort!' is mine,
'sort!-inline', is mine but with inlined cmp function,
'sort' is D's builtin.

less number - the better.

sequence from 1 to 1_000_000

  sort!-inline - 78 ms
  sort! - 203 ms
  sort  - 265 ms

'sort!'  takes 76% of D time
'sort!-inline'  takes 29% of D's builtin sort time

sequence from 1_000_000 to 1

  sort!-inline - 46 ms
  sort! - 125 ms
  sort  - 171 ms

'sort!'  takes 71% of D time
'sort!-inline'  26% of D time

--------------
template sort(T)
{
  void sort(T[] arr)
  {
      int cmp(in T l, in T r) { return l - r; }
      .....................
  }
}

Jul 14 2005

D Programming

C/C++ Programming

Other

digitalmars.D - qsort