www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Adding Radix Sort into Phobos

reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
I have a radix sort implementation at

https://github.com/nordlow/justd/blob/master/intsort.d#L92intsort.d

which beats Phobos own Quicksort by a factor 1.5 to 4 depending 
on element type (Intergral or FloatingPoint).

Anyone up for the job of adapting and merging it into Phobos' 
std.algorithm.sorting?

See also:

https://github.com/Xinok/XSort/issues/1#issuecomment-96321466
Apr 26 2015
next sibling parent reply Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/26/2015 09:16 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?=
<per.nordlow gmail.com>" wrote:
 I have a radix sort implementation at
 
 https://github.com/nordlow/justd/blob/master/intsort.d#L92intsort.d
 
 which beats Phobos own Quicksort by a factor 1.5 to 4 depending on
 element type (Intergral or FloatingPoint).
 
 Anyone up for the job of adapting and merging it into Phobos'
 std.algorithm.sorting?
Code seems to be pretty done (although there are lots of TODOs). Why not simply convert it into a pull request?
Apr 26 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Sunday, 26 April 2015 at 17:31:58 UTC, Martin Nowak wrote:
 On 04/26/2015 09:16 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?=
 <per.nordlow gmail.com>" wrote:
 I have a radix sort implementation at
 
 https://github.com/nordlow/justd/blob/master/intsort.d#L92intsort.d
 
 which beats Phobos own Quicksort by a factor 1.5 to 4 
 depending on
 element type (Intergral or FloatingPoint).
 
 Anyone up for the job of adapting and merging it into Phobos'
 std.algorithm.sorting?
Code seems to be pretty done (although there are lots of TODOs). Why not simply convert it into a pull request?
Ok, I'll try this. Does someone have any answers to these questions or should I wait until the prel. pull request is done?: •Figure out a way to template-parameterize radixSortImpl to make it work on aggregate element types when the comparison function can be expressed as an data-member-accessor of the aggregate. For example if ElementType is a struct S { int x, y; } and comparison function "a.x < b.x". •If possible implement a generic sorting algorithm that automatically selects the best suitable in-Place sorting algorithm based on types of the input parameters (range and comparsion/accessor function). This currently means isIntegral, float, double, and as above mentioned aggregates sorted on a member or combination of members whose bijection can fit into 8, 16, 32 or 64 bits.
Apr 27 2015
next sibling parent Martin Nowak <code+news.digitalmars dawg.eu> writes:
On 04/27/2015 09:52 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?=
<per.nordlow gmail.com>" wrote:
 Does someone have any answers to these questions or should I wait until
 the prel. pull request is done?:
 
 •Figure out a way to template-parameterize  radixSortImpl  to make it
 work on aggregate element types when the comparison function can be
 expressed as an data-member-accessor of the aggregate. For example if
 ElementType is a  struct S { int x, y; }  and comparison function "a.x <
 b.x".
I think an alias transform function that defaults to identity makes most sense. That's a bit similar to http://dlang.org/phobos/std_algorithm_sorting.html#schwartzSort. And it doesn't look like you support custom predicates.
 •If possible implement a generic sorting algorithm that automatically
 selects the best suitable in-Place sorting algorithm based on types of
 the input parameters (range and comparsion/accessor function). This
 currently means  isIntegral,  float,  double, and as above mentioned
 aggregates sorted on a member or combination of members whose bijection
 can fit into 8, 16, 32 or 64 bits.
That's a tough problem => treat it separately.
Apr 27 2015
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/27/15 12:52 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?= 
<per.nordlow gmail.com>" wrote:
 On Sunday, 26 April 2015 at 17:31:58 UTC, Martin Nowak wrote:
 On 04/26/2015 09:16 AM, "Per =?UTF-8?B?Tm9yZGzDtnci?=
 <per.nordlow gmail.com>" wrote:
 I have a radix sort implementation at

 https://github.com/nordlow/justd/blob/master/intsort.d#L92intsort.d

 which beats Phobos own Quicksort by a factor 1.5 to 4 depending on
 element type (Intergral or FloatingPoint).

 Anyone up for the job of adapting and merging it into Phobos'
 std.algorithm.sorting?
Code seems to be pretty done (although there are lots of TODOs). Why not simply convert it into a pull request?
Ok, I'll try this. Does someone have any answers to these questions or should I wait until the prel. pull request is done?: •Figure out a way to template-parameterize radixSortImpl to make it work on aggregate element types when the comparison function can be expressed as an data-member-accessor of the aggregate. For example if ElementType is a struct S { int x, y; } and comparison function "a.x < b.x". •If possible implement a generic sorting algorithm that automatically selects the best suitable in-Place sorting algorithm based on types of the input parameters (range and comparsion/accessor function). This currently means isIntegral, float, double, and as above mentioned aggregates sorted on a member or combination of members whose bijection can fit into 8, 16, 32 or 64 bits.
Yah, these are good angles/ideas. I'm curious, how come radixSort on long is better than quicksort? Conventional wisdom has it that quicksort is better for large-ish integral types. What data distributions did you test on? -- Andrei
Apr 27 2015
next sibling parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 27 April 2015 at 20:54:36 UTC, Andrei Alexandrescu 
wrote:
 Yah, these are good angles/ideas. I'm curious, how come 
 radixSort on long is better than quicksort? Conventional wisdom 
 has it that quicksort is better for large-ish integral types. 
 What data distributions did you test on? -- Andrei
Randomized using uniform distributions. I'm using my own random generator module at https://github.com/nordlow/justd/blob/master/random_ex.d for randomization. I would like to see something similar in Phobos aswell :) Compiled with -release -unittest -noboundscheck byte n:1000000 sort:46679us radixSort:4326us Speed-Up:10.7903 byte n:1000000 sort:46679us radixSort:4293us Speed-Up:10.8733 ubyte n:1000000 sort:50003us radixSort:3980us Speed-Up:12.5636 ubyte n:1000000 sort:50003us radixSort:4384us Speed-Up:11.4058 short n:1000000 sort:80287us radixSort:14794us Speed-Up:5.427 short n:1000000 sort:80287us radixSort:14839us Speed-Up:5.41054 ushort n:1000000 sort:81647us radixSort:14302us Speed-Up:5.70878 ushort n:1000000 sort:81647us radixSort:14586us Speed-Up:5.59763 int n:1000000 sort:83754us radixSort:30590us Speed-Up:2.73795 int n:1000000 sort:83754us radixSort:30458us Speed-Up:2.74982 uint n:1000000 sort:82657us radixSort:31570us Speed-Up:2.61821 uint n:1000000 sort:82657us radixSort:30006us Speed-Up:2.75468 long n:1000000 sort:82028us radixSort:68955us Speed-Up:1.18959 long n:1000000 sort:82028us radixSort:68668us Speed-Up:1.19456 ulong n:1000000 sort:83375us radixSort:68099us Speed-Up:1.22432 ulong n:1000000 sort:83375us radixSort:67611us Speed-Up:1.23316 float n:1000000 sort:93002us radixSort:22612us Speed-Up:4.11295 float n:1000000 sort:93002us radixSort:26618us Speed-Up:3.49395 double n:1000000 sort:94129us radixSort:58590us Speed-Up:1.60657 double n:1000000 sort:94129us radixSort:64004us Speed-Up:1.47067
Apr 28 2015
parent reply "Andrea Fontana" <nospam example.com> writes:
On Tuesday, 28 April 2015 at 10:32:47 UTC, Per Nordlöw wrote:
 On Monday, 27 April 2015 at 20:54:36 UTC, Andrei Alexandrescu 
 wrote:
 Yah, these are good angles/ideas. I'm curious, how come 
 radixSort on long is better than quicksort? Conventional 
 wisdom has it that quicksort is better for large-ish integral 
 types. What data distributions did you test on? -- Andrei
Randomized using uniform distributions. I'm using my own random generator module at https://github.com/nordlow/justd/blob/master/random_ex.d for randomization. I would like to see something similar in Phobos aswell :) Compiled with -release -unittest -noboundscheck byte n:1000000 sort:46679us radixSort:4326us Speed-Up:10.7903 byte n:1000000 sort:46679us radixSort:4293us Speed-Up:10.8733 ubyte n:1000000 sort:50003us radixSort:3980us Speed-Up:12.5636 ubyte n:1000000 sort:50003us radixSort:4384us Speed-Up:11.4058 short n:1000000 sort:80287us radixSort:14794us Speed-Up:5.427 short n:1000000 sort:80287us radixSort:14839us Speed-Up:5.41054 ushort n:1000000 sort:81647us radixSort:14302us Speed-Up:5.70878 ushort n:1000000 sort:81647us radixSort:14586us Speed-Up:5.59763 int n:1000000 sort:83754us radixSort:30590us Speed-Up:2.73795 int n:1000000 sort:83754us radixSort:30458us Speed-Up:2.74982 uint n:1000000 sort:82657us radixSort:31570us Speed-Up:2.61821 uint n:1000000 sort:82657us radixSort:30006us Speed-Up:2.75468 long n:1000000 sort:82028us radixSort:68955us Speed-Up:1.18959 long n:1000000 sort:82028us radixSort:68668us Speed-Up:1.19456 ulong n:1000000 sort:83375us radixSort:68099us Speed-Up:1.22432 ulong n:1000000 sort:83375us radixSort:67611us Speed-Up:1.23316 float n:1000000 sort:93002us radixSort:22612us Speed-Up:4.11295 float n:1000000 sort:93002us radixSort:26618us Speed-Up:3.49395 double n:1000000 sort:94129us radixSort:58590us Speed-Up:1.60657 double n:1000000 sort:94129us radixSort:64004us Speed-Up:1.47067
Have you done some tests with sorted-data/almost sorted data/etc?
Apr 28 2015
parent reply "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Tuesday, 28 April 2015 at 15:29:11 UTC, Andrea Fontana wrote:
 Have you done some tests with sorted-data/almost sorted 
 data/etc?
Nope.
May 15 2015
parent "deadalnix" <deadalnix gmail.com> writes:
On Friday, 15 May 2015 at 12:31:26 UTC, Per Nordlöw wrote:
 On Tuesday, 28 April 2015 at 15:29:11 UTC, Andrea Fontana wrote:
 Have you done some tests with sorted-data/almost sorted 
 data/etc?
Nope.
Amongst other thing, when you do the first count pass, you can check if the dataset is sorted and early bailout if it is already.
May 15 2015
prev sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 27 April 2015 at 20:54:36 UTC, Andrei Alexandrescu 
wrote:
 Yah, these are good angles/ideas. I'm curious, how come 
 radixSort on long is better than quicksort? Conventional wisdom 
 has it that quicksort is better for large-ish integral types. 
 What data distributions did you test on? -- Andrei
I just realized that I should make radiSort nogc by using import std.container.array: Array; Array!Elem y; instead of Elem[] y; By import std.container.array: Array; import std.array: uninitializedArray; alias A = Array!Elem; auto y = uninitializedArray!A(n); fails as intsort.d(222,38): Error: template std.array.uninitializedArray cannot deduce function from argument types !(Array!byte)(immutable(ulong)), candidates are: How do I void allocate it then?
Apr 28 2015
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 26 April 2015 at 07:16:24 UTC, Per Nordlöw wrote:
 I have a radix sort implementation at

 https://github.com/nordlow/justd/blob/master/intsort.d#L92intsort.d

 which beats Phobos own Quicksort by a factor 1.5 to 4 depending 
 on element type (Intergral or FloatingPoint).

 Anyone up for the job of adapting and merging it into Phobos' 
 std.algorithm.sorting?

 See also:

 https://github.com/Xinok/XSort/issues/1#issuecomment-96321466
You can make it work with float using: https://github.com/deadalnix/Dsort/blob/master/sort/radix.d#L27-L41
Apr 27 2015
next sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 27 April 2015 at 21:14:18 UTC, deadalnix wrote:
 You can make it work with float using: 
 https://github.com/deadalnix/Dsort/blob/master/sort/radix.d#L27-L41
I already support float and double. /Per
Apr 28 2015
prev sibling parent "Per =?UTF-8?B?Tm9yZGzDtnci?= <per.nordlow gmail.com> writes:
On Monday, 27 April 2015 at 21:14:18 UTC, deadalnix wrote:
 You can make it work with float using: 
 https://github.com/deadalnix/Dsort/blob/master/sort/radix.d#L27-L41
I don't support D's 80-bit real, though. Could we add support for bijectToUnsigned!real? See https://github.com/nordlow/justd/blob/master/intsort.d#L14
Apr 28 2015