
digitalmars.D - Optimizing Java using D

reply Walter Bright <newshound2 digitalmars.com> writes:
http://www.reddit.com/r/programming/comments/28mp3m/the_magic_forest_problem_revisited_optimising/

The article indicates there may be a performance problem with std.algorithm.sort
Jun 20 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 The article indicates there may be a performance problem with 
 std.algorithm.sort
Yes, Phobos unstable sort is not fast (years ago I showed a faster one here). And the keySort of Phobos (my brain refuses to learn its spelling) is even slower.

Bye, bearophile
Jun 20 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/20/2014 4:42 PM, bearophile wrote:
 Walter Bright:

 The article indicates there may be a performance problem with
std.algorithm.sort
Yes, Phobos unstable sort is not fast (years ago I showed a faster one here). And the keySort of Phobos (my brain refuses to learn its spelling) is even slower.
If you could submit a bugzilla issue and include the faster version, that would be great!
Jun 20 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Walter Bright:

 If you could submit a bugzilla issue and include the faster 
 version, that would be great!
My "faster" version is also doing lot of memory swaps (so it's not faster in all cases), and it's not generic enough for the current ranges, and it doesn't switch to a safe O(n ln n) sort like a HeapSort when the ordering of the input data is bad, like a safe introsort of a standard library has to do. So while I sometimes use it, I don't think it will be an improvement in the general case. Bye, bearophile
Jun 20 2014
parent Russel Winder via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Fri, 2014-06-20 at 23:59 +0000, bearophile via Digitalmars-d wrote:
 Walter Bright:
 
 If you could submit a bugzilla issue and include the faster 
 version, that would be great!
My "faster" version is also doing lot of memory swaps (so it's not faster in all cases), and it's not generic enough for the current ranges, and it doesn't switch to a safe O(n ln n) sort like a HeapSort when the ordering of the input data is bad, like a safe introsort of a standard library has to do. So while I sometimes use it, I don't think it will be an improvement in the general case.
Both Python (since that is the original) and Java now use modified Timsort as the standard sort. Java also has some specialization but not to the extent of using Radix Sort, which would be a good idea for pure integer data.

Per Nordlöw is offering a Radix Sort implementation. I had promised to do a thorough review of his code, but have so far failed to do more than a surface review. I clearly must try harder.

-- 
Russel.
=============================================================================
Dr Russel Winder                 t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road               m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK              w: www.russel.org.uk  skype: russel_winder
Jun 20 2014
prev sibling next sibling parent reply "Xinok" <xinok live.com> writes:
On Friday, 20 June 2014 at 23:34:32 UTC, Walter Bright wrote:
 http://www.reddit.com/r/programming/comments/28mp3m/the_magic_forest_problem_revisited_optimising/

 The article indicates there may be a performance problem with 
 std.algorithm.sort
I get the feeling that, because he is utilizing UFCS, it is actually calling the built-in array sort and not std.algorithm.sort. The built-in sort is 3-4x slower, so that would explain why his naive quicksort implementation was faster.
Jun 20 2014
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Xinok:

 I get the feeling that, because he is utilizing UFCS, that it 
 is actually calling the built-in array sort and not 
 std.algorithm.sort. The built-in sort is 3-4x slower, so that 
 would explain why his naive quicksort implementation was faster.
Please deprecate/kill the built-in sort before dmd 2.066 is out. Bye, bearophile
Jun 20 2014
parent "Brad Anderson" <eco gnuk.net> writes:
On Saturday, 21 June 2014 at 00:47:23 UTC, bearophile wrote:
 Xinok:

 I get the feeling that, because he is utilizing UFCS, that it 
 is actually calling the built-in array sort and not 
 std.algorithm.sort. The built-in sort is 3-4x slower, so that 
 would explain why his naive quicksort implementation was 
 faster.
Please deprecate/kill the built-in sort before dmd 2.066 is out. Bye, bearophile
Here's the issue for those curious about the state of this: https://issues.dlang.org/show_bug.cgi?id=10318

It seems like the deprecation was held up on Xinok being uncertain about issues of purity. His concern with sorting arrays of char isn't a problem as it shouldn't be done carelessly (as you wrote in the issue). I too am not sure how purity and delegates are supposed to work, but if you and others say it's fine to move forward we should definitely get it deprecated. The built-in sort has haunted us for too long (and is giving D a bad rap).
Jun 20 2014
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Friday, 20 June 2014 at 23:34:32 UTC, Walter Bright wrote:
 http://www.reddit.com/r/programming/comments/28mp3m/the_magic_forest_problem_revisited_optimising/

 The article indicates there may be a performance problem with 
 std.algorithm.sort
Two lessons:

1. Kill the builtin sort function. He is probably calling that one instead of the standard lib one.

2. Our associative arrays are a joke. Inconsistent interface + slow. H. S. Teoh worked on an alternative AA implementation; do we know where it stands?
Jun 20 2014
parent reply "Brad Anderson" <eco gnuk.net> writes:
On Saturday, 21 June 2014 at 00:56:32 UTC, deadalnix wrote:
 On Friday, 20 June 2014 at 23:34:32 UTC, Walter Bright wrote:
 http://www.reddit.com/r/programming/comments/28mp3m/the_magic_forest_problem_revisited_optimising/

 The article indicates there may be a performance problem with 
 std.algorithm.sort
2 lessons : 1. Kill the builtin sort function. He is probably calling that one instead of the standard lib one. 2. Our associative array are a joke. Inconsistent interface + slow. H;S; Teoh worked on an alternative AA implementation, do we know where it stands ?
He and Dicebot documented the issues he came up against pretty thoroughly here: http://wiki.dlang.org/AA_Implementation_Issues
Jun 20 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Brad Anderson:

 He and Dicebot documented the issues he came up against pretty
 thoroughly here:

 http://wiki.dlang.org/AA_Implementation_Issues
This looks like a bad idea, int[] != ubyte[]:

string[int[]] intAA;
ubyte[] key = [1, 2, 3];
intAA[key] = "abc";       // implicit convert ubyte[] -> int[] via .idup
auto ptr2 = key in intAA; // no .idup, but compatible comparison of ubyte[] with int[]

I think switching away from the built-in AAs will surely break some D code. So the most important job is to define a good future-proof API for the new AAs. Then we can accept the first version of the AAs to be less flexible than the current AAs. Later some flexibility could be restored.

In theory a compiler switch could allow D users for two compiler releases to switch from the new to the legacy version of AAs, but looking at the complexity of this change I am not sure that's possible and a good idea.

Bye, bearophile
Jun 20 2014
prev sibling parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Friday, 20 June 2014 at 23:34:32 UTC, Walter Bright wrote:
 http://www.reddit.com/r/programming/comments/28mp3m/the_magic_forest_problem_revisited_optimising/

 The article indicates there may be a performance problem with 
 std.algorithm.sort
I just tested his code, std.algorithm.sort with SwapStrategy.stable is much faster than his quicksort. std.algorithm.sort with SwapStrategy.unstable is considerably slower than his, whereas builtin sort is abysmal. I find that generally using SwapStrategy.stable performs better. This isn't universal though as there are cases where it is slower.
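
For reference, the strategy is selected with a template argument; a minimal sketch of the two calls being compared (made-up data, not the benchmark code):

import std.algorithm : sort, SwapStrategy;

void main()
{
    auto a = [5, 3, 9, 1, 4];
    auto b = a.dup;

    a.sort!("a < b", SwapStrategy.unstable)(); // the default strategy
    b.sort!("a < b", SwapStrategy.stable)();   // Timsort-based stable sort

    assert(a == [1, 3, 4, 5, 9] && a == b);
}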
Jun 20 2014
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
safety0ff:

 I just tested his code, std.algorithm.sort with 
 SwapStrategy.stable is much faster than his quicksort.

 std.algorithm.sort with SwapStrategy.unstable is considerably 
 slower than his, whereas builtin sort is  abysmal.

 I find that generally using SwapStrategy.stable performs better.
 This isn't universal though as there are cases where it is 
 slower.
The attention span of Reddit is excessively small, so corrections often seem ignored, but if you have tested the D code, then I suggest you also compare the timings of the C++ version(s) on the same computer and post the two (or more) timings in that Reddit thread. Decreasing a bit the bad advertising that is floating around about D doesn't hurt :-)

Bye, bearophile
Jun 20 2014
parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Saturday, 21 June 2014 at 01:30:58 UTC, bearophile wrote:
 The attention span of Reddit is excessively small, so 
 corrections often seem ignored, but if you have tested the D 
 code, then I suggest you to also compare the timings of the C++ 
 version(s) on the same computer and post the two (or more) 
 timings in that Reddit thread. Decreasing a bit the bad 
 advertising that is floating around about D doesn't hurt :-)
I don't have a reddit account.

See also:
https://github.com/logicchains/MagicForest/pull/2#issuecomment-46741029

Both using stable sorts: the D version runs in 3s and the C++ version in 3.2s under similar conditions.

Clang++ 3.4.1
LDC2 0.12.0 based on DMD v2.063.2 and LLVM 3.4.1
Jun 20 2014
parent reply "safety0ff" <safety0ff.dev gmail.com> writes:
On Saturday, 21 June 2014 at 01:54:45 UTC, safety0ff wrote:
 See also:
 https://github.com/logicchains/MagicForest/pull/2#issuecomment-46741029

 Both using stable sorts : D version runs in 3s and C++ version 
 in 3.2s under similar conditions.
For input of 500 500 500
Jun 20 2014
parent reply "logicchains" <jonathan.t.barnard gmail.com> writes:
Blog author here, I've added a note that D's sort matches the 
speed of C++'s when the stable sort is used instead of the 
default unstable. I don't think there's anything wrong with D's 
unstable sort however, as the C++ version also performs worse 
when using std::sort (unstable) instead of std::stable_sort.
Jun 20 2014
next sibling parent reply "Tofu Ninja" <emmons0 purdue.edu> writes:
On Saturday, 21 June 2014 at 03:52:54 UTC, logicchains wrote:
 Blog author here, I've added a note that D's sort matches the 
 speed of C++'s when the stable sort is used instead of the 
 default unstable. I don't think there's anything wrong with D's 
 unstable sort however, as the C++ version also performs worse 
 when using std::sort (unstable) instead of std::stable_sort.
Awesome! Glad to see D shown in a good light :) Great article btw.

On another note, if someArray.sort() calls the built in sort even when std.algorithm is imported, then most definitely built in sort needs to go. I can see plenty of times where people would not even realize they were calling the wrong sort or even that there was an error in their code.
Jun 20 2014
next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
On Saturday, 21 June 2014 at 05:40:32 UTC, Tofu Ninja wrote:
 On another note, if someArray.sort() calls the built in sort 
 even when std.algorithm is imported, then most definitely built 
 in sort needs to go, I can see plenty of times where people 
 would not even realize they were calling the wrong sort or even 
 that there was an error in their code.
Sounds like a good idea; the only reason I didn't make that mistake myself was because I was calling sort on the output of partition rather than directly on the array. Good news! I ran the D implementation using stable sort and the C++ implementation with input (2000,2000,2000), and while the C++ took 12m9s real, 10m23s user, the D only took 8m44s real, 7m41s user. So somehow the D's faster for really large input.
Jun 21 2014
prev sibling next sibling parent reply "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Saturday, 21 June 2014 at 05:40:32 UTC, Tofu Ninja wrote:
 On Saturday, 21 June 2014 at 03:52:54 UTC, logicchains wrote:
 Blog author here, I've added a note that D's sort matches the 
 speed of C++'s when the stable sort is used instead of the 
 default unstable. I don't think there's anything wrong with 
 D's unstable sort however, as the C++ version also performs 
 worse when using std::sort (unstable) instead of 
 std::stable_sort.
Awesome! Glad to see D shown in a good light :) Great article btw. On another note, if someArray.sort() calls the built in sort even when std.algorithm is imported, then most definitely built in sort needs to go, I can see plenty of times where people would not even realize they were calling the wrong sort or even that there was an error in their code.
It's a bit more subtle: `someArray.sort()` calls std.algorithm's sort, while `someArray.sort` (without parens) calls the built-in sort. But yes, the latter needs to go.
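
To make the trap concrete, a small sketch of the two spellings (this reflects the behaviour described above, with std.algorithm imported):

import std.algorithm;

void main()
{
    int[] a = [3, 1, 2];

    a.sort;   // property syntax: the old built-in array sort
    a.sort(); // with parens: std.algorithm.sort via UFCS

    // Both sort the array in place here, but the built-in one is much
    // slower and silently wins whenever the parens are forgotten.
}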
Jun 21 2014
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 21 Jun 2014 07:50:29 -0400, Marc Schütz <schuetzm gmx.net> wrote:

 On Saturday, 21 June 2014 at 05:40:32 UTC, Tofu Ninja wrote:
 On Saturday, 21 June 2014 at 03:52:54 UTC, logicchains wrote:
 Blog author here, I've added a note that D's sort matches the speed of C++'s when the stable sort is used instead of the default unstable. I don't think there's anything wrong with D's unstable sort however, as the C++ version also performs worse when using std::sort (unstable) instead of std::stable_sort.
 Awesome! Glad to see D shown in a good light :) Great article btw. On another note, if someArray.sort() calls the built in sort even when std.algorithm is imported, then most definitely built in sort needs to go, I can see plenty of times where people would not even realize they were calling the wrong sort or even that there was an error in their code.
 It's a bit more subtle: `someArray.sort()` calls std.algorithm's sort, while `someArray.sort` (without parens) calls the built-in sort. But yes, the latter needs to go.
Indeed. It's like a rotten easter egg. Needs to be killed with fire.

-Steve
Jun 23 2014
prev sibling parent reply Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 6/21/2014 1:40 AM, Tofu Ninja wrote:
 On another note, if someArray.sort() calls the built in sort even when
 std.algorithm is imported, then most definitely built in sort needs to
 go, I can see plenty of times where people would not even realize they
 were calling the wrong sort or even that there was an error in their code.
If there are any lingering reasons why built-in sort might still need to exist (I wouldn't know), then maybe it could get moved into object.d (perhaps by internally being renamed to, e.g., __sort, with a "sort" wrapper in object.d?). That way it should at least trigger a conflict with std.algorithm.sort instead of shadowing it.
Jun 22 2014
next sibling parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Sunday, 22 June 2014 at 17:34:15 UTC, Nick Sabalausky wrote:
 On 6/21/2014 1:40 AM, Tofu Ninja wrote:
 On another note, if someArray.sort() calls the built in sort 
 even when
 std.algorithm is imported, then most definitely built in sort 
 needs to
 go, I can see plenty of times where people would not even 
 realize they
 were calling the wrong sort or even that there was an error in 
 their code.
If there's any lingering reasons why built-in sort might still need to exist (I wouldn't know), then maybe it could get moved into object.d (perhaps by internally being renamed ex. __sort, with a "sort" wrapper in object.d?). That way it should at least trigger a conflict with std.algorithm.sort instead of shadowing it.
But then it wouldn't be callable with UFCS, no?
Jun 23 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 22 Jun 2014 13:33:55 -0400, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 On 6/21/2014 1:40 AM, Tofu Ninja wrote:
 On another note, if someArray.sort() calls the built in sort even when
 std.algorithm is imported, then most definitely built in sort needs to
 go, I can see plenty of times where people would not even realize they
 were calling the wrong sort or even that there was an error in their  
 code.
If there's any lingering reasons why built-in sort might still need to exist (I wouldn't know), then maybe it could get moved into object.d (perhaps by internally being renamed ex. __sort, with a "sort" wrapper in object.d?). That way it should at least trigger a conflict with std.algorithm.sort instead of shadowing it.
The only thing I can think of that won't work is sorting an array of char or wchar, which std.algorithm.sort will not do (right?). -Steve
Jun 23 2014
parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 The only thing I can think of that won't work is sorting an 
 array of char or wchar, which std.algorithm.sort will not do 
 (right?).
The solution I have suggested is:

myString.representation.sort().release.unrepresentation

(Where unrepresentation is not yet present in Phobos.)

Bye, bearophile
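
For a mutable char[], the representation-and-sort part already works today; a small sketch of that variant (ASCII-only data assumed, since this sorts raw UTF-8 code units):

import std.algorithm : sort;
import std.string : representation;

void main()
{
    char[] text = "dbca".dup;
    text.representation.sort(); // the ubyte[] view aliases the same memory
    assert(text == "abcd");     // only meaningful for ASCII content
}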
Jun 23 2014
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 20 Jun 2014 23:52:52 -0400, logicchains  
<jonathan.t.barnard gmail.com> wrote:

 Blog author here, I've added a note that D's sort matches the speed of  
 C++'s when the stable sort is used instead of the default unstable. I  
 don't think there's anything wrong with D's unstable sort however, as  
 the C++ version also performs worse when using std::sort (unstable)  
 instead of std::stable_sort.
Is it just me, or does this seem unintuitive? I would think a stable sort requires extra care, i.e. extra time, to ensure stability. Do we need an unstable sort then? Or is this a corner case? I am fully ignorant on these advanced sorting routines and how they work. The Quicksort-based sort routines are like black magic to me, my knowledge stops at merge sort :) -Steve
Jun 23 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Do we need an unstable sort then? Or is this a corner case?
It's probably a corner case. And the current stable sort is good.

Bye, bearophile
Jun 23 2014
prev sibling parent reply "Chris Cain" <zshazz gmail.com> writes:
On Monday, 23 June 2014 at 15:19:15 UTC, Steven Schveighoffer 
wrote:
 Is it just me, or does this seem unintuitive? I would think a 
 stable sort requires extra care, i.e. extra time, to ensure 
 stability.

 Do we need an unstable sort then? Or is this a corner case? I 
 am fully ignorant on these advanced sorting routines and how 
 they work. The Quicksort-based sort routines are like black 
 magic to me, my knowledge stops at merge sort :)

 -Steve
Technically, you can prove that there exists some unstable sort that is always faster than a stable sort. Andrei showed me this once:

[2,2,2,2,1]

With unstable sort, the 1 and first 2 could be swapped, but with stable sort, the 1 must swap with each 2 until it settles into the 0th position.

That said, I've never seen the unstable sort that's always faster than any stable sort and since it doesn't appear in any of the languages I know of, I've got to guess that no one else has discovered it either. That said, I've seen some cases where the unstable sort is faster (particularly in cases where the ordering of the elements is nearly completely random), so it's not as if unstable sort is currently always worse. It's just in many real-world scenarios it ends up being worse.

Furthermore, our stable sort (an implementation of Timsort) is impossible to really use "incorrectly" ... even using it very "stupidly" (as an example, append an element on the end of a sorted array and "insert" it by calling stable sort) will result in reasonable runtimes and do what the programmer expects of it. It makes writing simple quick scripts and prototypes absurdly easy and still very efficient. Plus maintaining the current order of the elements is really nice in general.

Personally, I'm still in the camp of suggesting that our default should be stable sort and if you're certain your use case will benefit from the extra speed of unstable sort, you should specify it. But I don't think that's going to change.
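
To make the stability guarantee concrete, a tiny sketch (the Card type is made up for the example, sorting on rank only):

import std.algorithm : sort, SwapStrategy;

struct Card { int rank; char suit; } // made-up example type

void main()
{
    auto hand = [Card(7, 'h'), Card(2, 's'), Card(7, 'c'), Card(2, 'd')];

    // Stable sort: equal ranks keep their original relative order, so
    // the 7 of hearts is guaranteed to stay ahead of the 7 of clubs.
    hand.sort!((a, b) => a.rank < b.rank, SwapStrategy.stable)();
    assert(hand == [Card(2, 's'), Card(2, 'd'), Card(7, 'h'), Card(7, 'c')]);

    // The default unstable sort also orders by rank, but the relative
    // order of equal ranks is then unspecified.
}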
Jun 23 2014
next sibling parent reply "Andrea Fontana" <nospam example.com> writes:
How can this be proven?
Is it valid only for "swap-based" sorting algorithms?

For example, radix sort is stable and its complexity is O(kn). Is 
there a faster unstable sort?

On Monday, 23 June 2014 at 15:38:25 UTC, Chris Cain wrote:
 Technically, you can prove that there exists some unstable sort 
 that is always faster than a stable sort. Andrei showed me this 
 once:
Jun 23 2014
next sibling parent "Andrea Fontana" <nospam example.com> writes:
Whoops, "comparison based"

On Monday, 23 June 2014 at 15:57:12 UTC, Andrea Fontana wrote:
 How can this be proven?
 Is it valid only for "swap-based" sorting algorithms?

 For example, radix sort is stable and its complexity is O(kn). 
 Is there a faster unstable sort?

 On Monday, 23 June 2014 at 15:38:25 UTC, Chris Cain wrote:
 Technically, you can prove that there exists some unstable 
 sort that is always faster than a stable sort. Andrei showed 
 me this once:
Jun 23 2014
prev sibling parent reply "Chris Cain" <zshazz gmail.com> writes:
On Monday, 23 June 2014 at 15:57:12 UTC, Andrea Fontana wrote:
 How can this be proven?
 Is it valid only for "swap-based" sorting algorithms?

 For example, radix sort is stable and its complexity is O(kn). 
 Is there a faster unstable sort?
It's not really about the time complexity but the absolute time it must take. But I showed the example that shows that any stable sort must do extra work:

[2,2,2,2,1]

An unstable sort may swap the first 2 and the 1 whereas a stable sort must, by definition, maintain the relative positioning of the 2s, which means it must move all of the 2s upwards. Think of a "magical" sort that knows exactly the perfect swaps to do in order to sort (that is, "magical" sort's complexity is O(n)). In the case above, "magical unstable sort" does 1 swap and "magical stable sort" must do 4 (if it doesn't, it's not stable). This shows that stable sorts must strictly be equal to or slower than unstable sorts (and an unstable sort could sometimes do better than a stable sort).

Or, to look at it another way, all "stable" sorts satisfy the qualities needed by an "unstable" sort, but not all "unstable" sorts satisfy the qualities of a "stable" sort. The minimum number of swaps/operations to create a sorted array won't necessarily maintain the stability property of the elements.
Jun 23 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/23/14, 9:33 AM, Chris Cain wrote:
 Or, to look at it another way, all "stable" sorts satisfy the qualities
 needed by an "unstable" sort, but not all "unstable" sorts satisfy the
 qualities of a "stable" sort. The minimum number of swaps/operations to
 create a sorted array won't necessarily maintain the stability property
 of the elements.
Yeah, it's really simple: the set of unstable sorting algorithms includes the set of stable sorting algorithms, so unstable sorting is vacuously "at least as good or better".

I'm baffled by the recurrent assertion that stable sort algorithms are somehow faster than unstable sort algorithms.

First, comparing entire sets of algorithms on unspecified input statistics is tenuous. A comparison would be between specific algorithms on specific data statistics.

Second, there are known reasons and setups where Timsort and derivatives fare better than classic quicksort-based implementations, and generalizing them into some magic "stable sort is just better" umbrella argument is just odd. Timsort works well on data with large continuous sorted runs, and that's an important case but one that a quicksort-based approach could and should avail itself of as well.

Third, again, anything stable is also unstable :o).


Andrei
Jun 23 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Corollary: the default sorting algorithm in std will always be unstable, 
even if for certain data types and inputs it will produce stable 
sorting. -- Andrei
Jun 23 2014
next sibling parent Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
 Sent: Monday, June 23, 2014 at 10:26 AM
 From: "Andrei Alexandrescu via Digitalmars-d" <digitalmars-d puremagic.com>
 To: digitalmars-d puremagic.com
 Subject: Re: Optimizing Java using D

 Corollary: the default sorting algorithm in std will always be unstable, 
 even if for certain data types and inputs it will produce stable 
 sorting. -- Andrei
Essentially, the sorting order of unstable sort is undefined, so it _can_ produce stable sorting, but it's not guaranteed to. It would be kind of silly to try and make it so that unstable sort was _guaranteed_ to not produce stable sorting. We _technically_ could write an algorithm that was always stable but named unstableSort without violating what unstable sort is. It would just be silly.

- Jonathan M Davis
Jun 23 2014
prev sibling parent "Chris Cain" <zshazz gmail.com> writes:
On Monday, 23 June 2014 at 17:26:47 UTC, Andrei Alexandrescu 
wrote:
 Corollary: the default sorting algorithm in std will always be 
 unstable, even if for certain data types and inputs it will 
 produce stable sorting. -- Andrei
I'd also like to note that this whole discussion about restrictions and such can also be shown as an argument for why stable sort should be the default. That is, we can prove that, with the restriction of stability, if an unstable sort will give you suitable results then a stable sort will also give you suitable results, but the reverse is not true.

That is, if you don't know whether relaxing the stability requirement of the sort will give you "wrong answers" on some inputs of your programs, sorting using a stable sort is the correct choice. You need to have a reason to relax that requirement before doing it.

Speed is an OK argument for choosing unstable sort as the default, but I think stronger guarantees that it will produce a correct output for a general problem are a better argument. Relaxation of guarantees should be an opt-in decision by a programmer, only when they know the difference doesn't matter to the problem. This sort of logic is consistent with decisions made in the past about the direction of D.
Jun 24 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 23 June 2014 at 17:25:13 UTC, Andrei Alexandrescu 
wrote:
 Second, there are known reasons and setups where Timsort and 
 derivatives fare better than classic quicksort-based 
 implementations, and generalizing them into some magic "stable 
 sort is just better" umbrella argument is just odd. Timsort 
 works well on data with large continuous sorted runs, and 
 that's an important case but one that a quicksort-based 
 approach could and should avail itself of as well.
Note that an unstable version of Timsort can perform better on some inputs, as you can use descending runs (whereas only strictly descending runs are usable in the stable version).
Jun 23 2014
prev sibling next sibling parent reply "Andrea Fontana" <nospam example.com> writes:
On Monday, 23 June 2014 at 16:33:13 UTC, Chris Cain wrote:
 It's not really about the time complexity but the absolute time 
 it must take. But I showed the example that shows that the fact 
 that any stable sort must do extra work:

 [2,2,2,2,1]
 [...]
I'm sorry if I'm going to say some stupid things, but I'm still not sure about this.

You said: "you can prove that there exists some unstable sort that is always faster than a stable sort". First of all: stable sorts are also unstable, OK, but I think you meant: "you can prove that there exists some unstable sort - one that isn't a stable sort too - that is always faster than a stable sort", didn't you?

With your example you're showing that in this specific case it is faster; you're not proving that it is *always* faster (btw, I think you meant equal or faster). In my opinion you're just proving that for *swap-based algorithms* the number of swaps you need for unstable sorting is always *less than or equal to* the number of swaps you need for stable sorting.

Anyway, some sorting algorithms don't use comparison between elements, so swapping doesn't make any sense there. Radix sort, for example, never compares two elements. It "directly" writes the output array, so I don't think the number of swaps is the right metric. Counting sort doesn't do any comparisons either, and it doesn't swap elements.

Moreover, you're assuming it should work "in-place". Usually we need a sorted copy, so you can't just swap those elements; you need to copy the other elements too and then apply your swaps over the copy. If the elements are complex you have to copy them to the new array one by one.

Using your example, a "magical-stable-sort-without-swaps-planning-algorithm" could guess that we just need to prepend the last element before the first one: it is a stable sort. There's no swap at all. And we copy exactly n elements. (And it's faster than copying all n elements and then doing the swap.)
Jun 24 2014
parent reply "Chris Cain" <zshazz gmail.com> writes:
On Tuesday, 24 June 2014 at 09:22:24 UTC, Andrea Fontana wrote:
 On Monday, 23 June 2014 at 16:33:13 UTC, Chris Cain wrote:
 It's not really about the time complexity but the absolute 
 time it must take. But I showed the example that shows that 
 the fact that any stable sort must do extra work:

 [2,2,2,2,1]
 [...]
I'm sorry if i'm going to say some stupid things, but I'm still not sure about this. You said: "you can prove that there exists some unstable sort that is always faster than a stable sort" First of all: stable sort are also unstable, ok, but I think you meant: "you can prove that there exists some unstable sort - that it isn't a stable sort too - that is always faster than a stable sort", didn't you?
Of course. It couldn't take advantage of the looser definition to be faster otherwise.
 With your example you're showing that is this specific case is 
 faster, you're not proving that is *always* faster (btw, i 
 think you meant equal or faster).
Yes, equal to or faster in any individual case. Since there exists one specific case in which it must be faster, that means *in general* it is faster. I didn't mean "in every circumstance it's faster", just that unstable sorts will always be, in general, faster because they can do all the same things as a stable sort but they have fewer restrictions and can potentially perform "short-cuts" that are illegal with stable sorts. Basically, there's no reason an unstable sort will ever be slower than a stable sort, but this "one specific case" at least shows the reverse is not true.
 In my opinion you're just proving that for *swap-based 
 algorithms* number of swaps you need to unstable sorting is 
 always *less or equals* to the number of swaps you need to 
 stable sorting.

 Anyway some sorting algorithms don't use comparison between 
 elements so swap doesn't make any sense. Radix sort, for 
 example, never compares two elements. It "directly" write the 
 output array, so I don't think the number of swaps is the right 
 metric. Counting sort doesn't do any comparison too, and it 
 doesn't swap elements.
OK, sure, but those are technically also unstable sorts, so it's not as if you've found a case where what I said is untrue. Plus you're ignoring the fact that those are specialized solutions with reduced restrictions of their own. They aren't general sorting algorithms that work perfectly in all situations. Plus counting sort could be said to be strictly an unstable sort because it doesn't necessarily preserve the order of the elements (in fact, it removes the identity of each element entirely, but that's one of the reasons it's not a good "general sorting algorithm"). Let's say you use counting sort on a hand of cards and you want to sort based on the card values irrespective of their suit. I think at this point you know that it's impossible to preserve the stability of the cards with counting sort.
 Moreover you're assuming it should work "in-place". Usually we 
 need a sorted copy, so you can't just swap that elements, you 
 need to copy the other elements too and then apply your swaps 
 over copy. If element are complex you have to copy element to 
 new array one-by-one.

 Using your example a 
 "magical-stable-sort-without-swaps-planning-algorithm" could 
 guess that we just need to prepend last element before the 
 first one: it is a stable sort. There's no swap at all. And we 
 copy exactly n elements. (and it's faster than copy all n 
 elements and then do the swap).
Of course, that's yet more playing around with restrictions. Plus your proposed relaxed stable sort is also the same as a relaxed unstable sort, so your restriction just made them identically fast.

That said, what you have to understand is that this is showing that restrictions that you place on the data/storage/etc will make it so that you can only do a subset of all possible algorithms. Stability is a restriction that makes it so that you can't do all the same things an unstable sort can do. Never does this mean that a stable sort can be faster, but it *can* mean that an unstable sort will be faster. In the by-default restrictions placed on std.algorithm.sort, unstable sort is faster. The possibility of application/removal of other restrictions making them ultimately equivalent is irrelevant.

So if you're wondering if there possibly exists a set of conditions where stable sort and unstable sort are equal, then yes, there are. However, if you know nothing about the conditions or restrictions, it's a safe bet to say that unstable sort is equal to or faster than stable sort. Furthermore, if you know about the conditions/restrictions on std.algorithm.sort, you _know_ unstable sort can be made faster than stable sort.
Jun 24 2014
parent "Chris Cain" <zshazz gmail.com> writes:
On Tuesday, 24 June 2014 at 16:37:06 UTC, Chris Cain wrote:
 Of course, that's yet more playing around with restrictions. 
 Plus your proposed relaxed stable sort is also the same as 
 relaxed unstable sort so your restriction just made them 
 identically fast.
By "relaxed" here, I was thinking in terms of not having to worry about the restriction of memory bounds (since you're required to create new memory anyway). Technically it's viewable from both the perspective of a restriction and a relaxation of a normally natural restriction.
Jun 24 2014
prev sibling parent reply "Wanderer" <no-reply no-reply.org> writes:
On Monday, 23 June 2014 at 16:33:13 UTC, Chris Cain wrote:
 It's not really about the time complexity but the absolute time 
 it must take. But I showed the example that shows that the fact 
 that any stable sort must do extra work:

 [2,2,2,2,1]

 An unstable sort may swap the first 2 and the 1 whereas a 
 stable sort must, by definition, maintain the relative 
 positioning of the 2s, which means it must move all of the 2s 
 upwards. Think of a "magical" sort that knows exactly the 
 perfect swaps to do in order to sort (that is, "magical" sort's 
 complexity is O(n)). In the case above, "magical unstable sort" 
 does 1 swap and "magical stable sort" must do 4 (if it doesn't, 
 it's not stable). This shows that stable sorts must strictly be 
 equal to or slower than unstable sort (and an unstable sort 
 could sometimes do better than a stable sort).

 Or, to look at it another way, all "stable" sorts satisfy the 
 qualities needed by an "unstable" sort, but not all "unstable" 
 sorts satisfy the qualities of a "stable" sort. The minimum 
 number of swaps/operations to create a sorted array won't 
 necessarily maintain the stability property of the elements.
Nobody, never, measures sort algorithms by amount of swaps. Swapping two references is a zero-cost operation; *comparing* two elements of an array is what's important, because it requires comparison function calls, field resolves/reads, actual comparison math, etc. That's why sort algorithms are always measured by the amount of compares.

And both versions of sort - stable and unstable - for this particular example you demonstrated above - [2,2,2,2,1] - should involve all five elements in comparison. If your unstable sort only compares first and last elements, swaps them and returns, it's broken.
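
If one wants to check this empirically, comparison calls are easy to count by instrumenting the predicate; a small sketch (not a rigorous benchmark):

import std.algorithm : sort, SwapStrategy;

void main()
{
    size_t compares;
    auto data = [2, 2, 2, 2, 1];

    data.sort!((a, b) { ++compares; return a < b; }, SwapStrategy.stable)();
    // `compares` now holds the number of predicate calls this run made;
    // swaps would have to be counted separately, e.g. with a custom type.
    assert(data == [1, 2, 2, 2, 2]);
}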
Jul 03 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/14, 12:29 AM, Wanderer wrote:
 Nobody, never, measures sort algorithms by amount of swaps.
That... is quite the claim. -- Andrei
Jul 03 2014
parent reply Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:
On 03/07/2014 9:13 AM, Andrei Alexandrescu wrote:
 On 7/3/14, 12:29 AM, Wanderer wrote:
 Nobody, never, measures sort algorithms by amount of swaps.
That... is quite the claim. -- Andrei
Most of the algorithm rankings I am aware of list both compares and swaps, because which one has the biggest effect on computation depends on the data (not its ordering, but the complexity of comparison) and how it is stored (all in a single page of memory vs across multiple networked disks vs in immutable memory such that each swap actually duplicates the whole dataset).

Saying that one is always more significant than the other is far too much of an oversimplification.

A...
Jul 03 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Thursday, 3 July 2014 at 11:30:57 UTC, Alix Pexton wrote:
 Saying that one is always more significant than the other is 
 far too much of an oversimplification.
I just thought, with the presence of structs in D, things are not that simple. Structs don't use references and their contents are located "right in place", if I understand this correctly. In other words, if you have an array of structs, each struct of 100K size, and the array's size is 10,000 elements, calling sort on it would require shuffling 1 gigabyte of memory in the worst case, stopping the entire application. (Sorting the same array in Java would require shuffling only 40K of references and would finish instantly, because only reference complex types are allowed there.)

All I can say - yes, the amount of swaps can be important in this case of structs, but it's terribly inefficient anyway and should be avoided IMHO.
Jul 03 2014
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 3 July 2014 at 15:40:33 UTC, Wanderer wrote:
 On Thursday, 3 July 2014 at 11:30:57 UTC, Alix Pexton wrote:
 Saying that one is always more significant than the other is 
 far too much of an oversimplification.
I just thought, with the presence of structs in D, things are not that simple. Structs don't use references and their contents is located "right in place" if I understand this correctly. In other words, if you have array of structs, each struct of 100K size, and array's size is 10,000 elements, calling sort on it would require shuffling 1 gigabyte of memory in the worst case, stopping the entire application. (Sorting the same array in Java would require shuffling only 40K of references and would finish instantly, because only reference complex types are allowed there.) All I can say - yes, amount of swaps can be important in this case of structs, but it's terribly inefficient anyway and should be avoided IMHO.
Firstly, who uses 100kB structs? That's insanely large. Secondly, sometimes you need the data to be contiguous in memory after sorting, for performance or compatibility reasons. The point where this is important is usually in a middle ground, things like a complex type (std.complex.Complex can be as large as 32 bytes IIRC) or a particle struct that holds position, velocity and acceleration vectors.
Jul 03 2014
prev sibling next sibling parent reply "ed" <gmail gmail.com> writes:
On Thursday, 3 July 2014 at 07:29:42 UTC, Wanderer wrote:
 Nobody, never, measures sort algorithms by amount of swaps.
What if you're sorting a large database with large records?
Jul 03 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Thursday, 3 July 2014 at 10:13:20 UTC, ed wrote:
 What if you're sorting a large database with large records?
Databases don't sort their records physically. The main reason for that is that each record has many columns so there are many various possible sort orders. Instead, databases maintain indices which allow you to retrieve the data in sorted order. That's a different story, not much in common with sorting an array/collection in memory. On Thursday, 3 July 2014 at 11:30:57 UTC, Alix Pexton wrote:
 and how it is stored (all in a single page of memory vs across 
 multiple networked disks vs in immutable memory such that each 
 swap actually duplicate the whole dataset).
And how much of that wide area is actually covered by D's sort that was discussed here?
Jul 03 2014
next sibling parent Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:
On 03/07/2014 4:16 PM, Wanderer wrote:
 On Thursday, 3 July 2014 at 11:30:57 UTC, Alix Pexton wrote:
 and how it is stored (all in a single page of memory vs across
 multiple networked disks vs in immutable memory such that each swap
 actually duplicate the whole dataset).
And how much of that wide area is actually covered by D's sort that was discussed here?
All in a single page is the runtime ideal. Immutable memory where every swap duplicates the whole dataset is the CTFE reality. [But sorting at compile time is a really bad idea!] Multiple networked disks was just the worst case I can think of, someone somewhere will be doing it, but maybe not in D. A...
Jul 03 2014
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Wanderer"  wrote in message news:aroorrxjloihxtthknwt forum.dlang.org... 

 Databases don't sort their records physically. The main reason 
 for that is that each record has many columns so there are many 
 various possible sort orders.
Yes they do. http://en.wikipedia.org/wiki/Database_index#Clustered You can obviously only do that for one index.
Jul 04 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Friday, 4 July 2014 at 12:18:54 UTC, Daniel Murphy wrote:
 Yes they do.  
 http://en.wikipedia.org/wiki/Database_index#Clustered

 You can obviously only do that for one index.
Ugh, and what happens in such a hypothetical database if you update its first row so it becomes 1 byte longer than before? Total reordering of the whole database just to maintain that index? That's very similar to the issue I posted above - wasteful shuffling of big amounts of memory only because D's structs don't support indirection by their nature, and you can't swap two elements without copying their whole contents.

John Colvin said above that structs can be useful, even together with sorting (and similar algorithms), if they are relatively small. This is true, but this design is not scalable. As soon as your program evolves and your struct grows with more and more fields, D's sort on it becomes slower and slower, proportionally to the struct's size. That's not the best behavior IMHO.
Jul 05 2014
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2014 18:02, Wanderer wrote:
 On Friday, 4 July 2014 at 12:18:54 UTC, Daniel Murphy wrote:
 Yes they do. http://en.wikipedia.org/wiki/Database_index#Clustered

 You can obviously only do that for one index.
Ugh, and what happens in such hypothetical database if you update its first row so it becomes 1 byte longer than before?
Provision some extra space in each record. DBs do this all the time, regardless of layout.
 Total reordering of
 the whole database just to maintain that index? That's very similar to
 the issue I posted above - wasteful shuffling of big amounts of memory
 only because D's structs don't support indirection by their nature, and
 you can't swap two elements without copying their whole contents.
It's funny to see how an _ability_ to avoid indirection can be framed as a problem. Nothing prevents having an array of pointers to structs or just using classes if desired.

On the contrary, sorting an array of pairs of integers in, say, Java is shamefully inefficient. No wonder packed objects (=value types) are coming to Java, to address these sorts of needs.
http://www.slideshare.net/mmitran/ibm-java-packed-objects-mmit-20121120
 John Colvin said above that structs can be useful, even together with
 sorting (and similar algorithms), if they are relatively small. This is
 true, but this design is not scalable. As soon as your program evolves
 and your struct grows with more and more fields, D's sort on it becomes
 slower and slower, proportionally to the struct's size. That's not the
 best behavior IMHO.
Ehm. One can always redesign struct layout to push stuff out through an indirection. And it's not like structs grow organically on their own ;)

In the same vein, the design of ref-to-instance is not scalable with millions of items scattered across the memory causing an awful cache miss rate (as the indirection has to be crossed to compare things), not to mention the memory wasted per each object.

-- 
Dmitry Olshansky
Jul 05 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Saturday, 5 July 2014 at 14:20:33 UTC, Dmitry Olshansky wrote:
 Provision some extra space in each record. DBs do this all the 
 time, regardless of layout.
Which means waste of space you complained about just below. Besides, you understand this is not a solution: one byte more than that reserve can provide, and the reordering should be done. Stuff like FAT was created exactly because linear storages are terribly inefficient unless they are read-only.
 It's funny to see how an _ability_ to avoid indirection can be 
 framed as a problem. Nothing prevents having an array of 
 pointers to structs or just using classes if desired.
Pointers to structs? Manually add indirection for sorting? That's reinventing the wheel. Indeed better (and simpler) using classes instead. They provide the same functionality and without using unsafe features.
 On the contrary sorting an array of pairs of integers in say 
 Java is shamefully inefficient. No wonder packed objects 
 (=value types)  are coming to Java, to address these sorts of 
 need.
For pair of integers, you can use long and sort an array of longs. But if your structure is any more complex (or has nontrivial comparison algorithm), one should really consider using classes first.
 Ehm. One can always redesign struct layout to push stuff out 
 through an indirection. And it's not like structs grow 
 organically on their own ;)
You mean structs with pointers? That violates the whole principle of structs IMHO. Ideally, when you assign/copy a struct, you assign/copy a *value*. Usage of pointers would add undue aliasing to the copy process, leading to bugs.
 In the same vane design of ref-to-instance is not scalable with 
 millions of items scattered across the memory causing awful 
 cache miss rate (as the indirection got to be crossed to 
 compare things), not to mention the memory wasted per each 
 object.
It depends. With classes, you have to read and write only small portions of memory - read only fields to compare, write only references to swap. Even with cache misses, this is still very small and efficient operations. With structs, you have to read and write much bigger regions of memory - to swap two items, you have to read both of them fully, then write both of them fully into new locations - which would make caching totally useless, whether structs are located sequentially or not. It's like copying a large file - the cache gets exhausted instantly, and doesn't speed things up at all.
Jul 05 2014
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Jul-2014 19:08, Wanderer wrote:
 On Saturday, 5 July 2014 at 14:20:33 UTC, Dmitry Olshansky wrote:
 Provision some extra space in each record. DBs do this all the time,
 regardless of layout.
Which means waste of space you complained about just below.
There are trade-offs. The world is not black and white and I don't follow 'one rule everywhere'.
 Besides, you understand this is not a solution: one byte more than that
 reserve can provide, and the reordering should be done.
Which is what happens at times, regardless of how the DB was organized.
 Stuff like FAT
 was created exactly because linear storages are terribly inefficient
 unless they are read-only.
Frankly, FAT is junk. But in any case the file maintenance problem is all about balancing wasting some memory on gaps against the need to defragment linked chains. There is no silver bullet of the kind you seem to imply.
 It's funny to see how an _ability_ to avoid indirection can be framed
 as a problem. Nothing prevents having an array of pointers to structs
 or just using classes if desired.
Pointers to structs? Manually add indirection for sorting? That's reinventing the wheel. Indeed better (and simpler) using classes instead. They provide the same functionality and without using unsafe features.
Pointers are perfectly fine as long as there is no pointer arithmetic. Also there is no problem just constructing a sorted index (e.g. there is a function for that in Phobos) for the array of big structs, and voila - the same reference to the item, but the stuff is neatly packed in a contiguous array.
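
For example, a minimal sketch with std.algorithm's makeIndex (the Big struct here is made up for illustration):

import std.algorithm : makeIndex;

struct Big { int key; ubyte[252] payload; } // made-up large record

void main()
{
    auto records = [Big(3), Big(1), Big(2)];
    auto index = new size_t[](records.length);

    makeIndex!((a, b) => a.key < b.key)(records, index);

    // records is untouched; index gives the sorted traversal order.
    assert(index == [1, 2, 0]);
}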
 On the contrary sorting an array of pairs of integers in say Java is
 shamefully inefficient. No wonder packed objects (=value types)  are
 coming to Java, to address these sorts of need.
For pair of integers, you can use long and sort an array of longs.
Which is hopelessly admitting defeat. A pair may have a non-trivial comparator. And a minor step beyond that, such as a pair of a double and an integer, and it fails completely.
 But
 if your structure is any more complex (or has nontrivial comparison
 algorithm), one should really consider using classes first.
And there Java-style indirection-happy classes "shine". For instance, modeling any complex stuff with classes alone would lead to things like:

House--reference->Bedroom---reference-->Bed--reference-->Pillow

Which is an incredible waste of memory and speed on something so simple.
 Ehm. One can always redesign struct layout to push stuff out through
 an indirection. And it's not like structs grow organically on their
 own ;)
You mean structs with pointers? That violates the whole principle of structs IMHO.
What? But again it seems to be the general direction of your post - puristic one-size-fits-all. The idea that there is a true holy way is very much the province of OOP language proponents.
 Ideally, when you assign/copy a struct, you assign/copy a
 *value*. Usage of pointers would add undue aliasing to the copy process,
 leading to bugs.
Copy constructors or in D parlance postblits. There is also CoW and whatnot. In fact swap doesn't create a copy in D, it just does a bitwise swap in-place.
 In the same vane design of ref-to-instance is not scalable with
 millions of items scattered across the memory causing awful cache miss
 rate (as the indirection got to be crossed to compare things), not to
 mention the memory wasted per each object.
It depends. With classes, you have to read and write only small portions of memory - read only fields to compare
Strictly more than with structs, to be precise, as long as it's just compare we talk about. A reference + fields vs just fields. Also the cost of the extra indirection is very much non-negligible.
 write only references to swap.
That much is true.
 Even with cache misses, this is still very small and efficient
 operations.
Disagree - the moment a cache line is loaded it doesn't matter how much you read in this line. Even a tiny read missing a cache is paid in full.
 With structs, you have to read and write much bigger regions
 of memory - to swap two items, you have to read both of them fully, then
 write both of them fully into new locations -
I bet compare will load the whole thing into cache anyway, plus contiguous structs may fit into a single line. Sequential storage has advantages.
 which would make caching
 totally useless, whether structs are located sequentially or not.
How's that?
 It's
 like copying a large file - the cache gets exhausted instantly, and
 doesn't speed things up at all.
And the only thing worse than copying a large file is copying a large file fragmented in tiny splinters.

-- 
Dmitry Olshansky
Jul 05 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Saturday, 5 July 2014 at 16:03:17 UTC, Dmitry Olshansky wrote:
 There are trade-offs. The world is not black and white and I 
 don't follow 'one rule everywhere'.
This is not a trade-off at all. You suggested keeping database records linearly, with space gaps between records to support "tiny updates". Without serious updates, this is a major waste of space. With them, your design won't save the day, because any gap will eventually be consumed, and without fragmentation/reordering the storage will fail.

FYI, nowadays popular databases aren't designed this way, exactly for the reasons I described above. All databases I have worked with (MySQL, Oracle and Firebird) hold records in a very compact way, without a single byte of gaps, with the ability to fragment, and without total physical ordering. So at least the designers of these DBMS have agreed that your design is not practical.
 Pointers are perfectly fine as long as there is no pointer 
 arithmetic.
Wrong. Merely holding a pointer (i.e. a physical address) is unsafe already. Non-deep serialization, or any other "preservation" of such a struct and GC is unable to keep the track of pointers. GC moves the object around or deletes it, and you have a tiny black hole in your app.
 Which is hopelessly admitting defeat. A pair may have 
 non-trivial comparator. And a minor step beyond that such as a 
 pair of double and integer and it fails completely.
I said above: in any non-trivial case, use classes instead of overly-clever structures. And if you really, really need premature optimization, there is java.nio and buffers. Create a ByteBuffer (a direct one if you need a super optimized solution) and treat slices of it as "structs". That's possible and easy to implement, but really not needed in practice because all you get is 0.1% of memory saving and no gain in speed.
 And there Java-style indirection-happy classes "shine". For 
 instance modeling any complex stuff with classes alone would 
 lead to things like:
 House--reference->Bedroom---reference-->Bed--reference-->Pillow

 Which is incredible waste of memory and speed on something so 
 simple.
That's not complex. That's very simple. It would become slightly more complex if all house parts implemented the same root interface and basic operations, like examining what's around and finding an object given by its path or properties, or throwing an extra pillow or two onto the same bed. All that is just a dream for structs.
 Copy constructors or in D parlance postblits. There is also CoW 
 and whatnot. In fact swap doesn't create a copy in D, it just 
 does a bitwise swap in-place.
And here we have a tough choice for structs with pulled-out subdata "for efficiency": either the assignment operator copies that data too (making routines like sort perform horribly slowly), or it performs only a shallow copy, causing undue sharing of data and violating the whole struct "value" philosophy.
 Disagree - the moment a cache line is loaded it doesn't matter 
 how much you read in this line. Even a tiny read missing a 
 cache is paid in full.
But the amount of these "missed reads" is low, so less amount of cache is invalidated. CPU cache is not a single page that gets invalidated as a whole, it's more like many small subpages, each of them is treaten individually. If you're really into the low-level mumbo jumbo, here are two more aspects for you to consider. 1. Since indirection does not require the object contents to be moved around while sorting, these objects can be read into cache once and never invalidated later - unlike "right in place" structs which invalidate CPU cache each time a swap is performed. That is a huge cache win already. 2. Don't forget that new objects are created in eden space in Java - which is essentially the stack area. Very fast, compact, sequential memory. If your array fits in eden (and that's surely true for the forest problem this thread has started from), then allocating array of objects is essentially the same as allocating array of structs: object contents aren't "scattered in memory" but are located one-after-one without gaps between them. That greatly aids caching as well. The main difference between eden and "array of structs" is that the allocation of the former never fails (assuming there's enough memory), only gets slightly slower in the worst case, and allocation of the latter can fail because "stack overflow error" or "too much memory fragmentation oops", even if there's enough free memory.
 And the only thing worse than copying a 
 large file is copying a large file fragmented into tiny splinters.
You surely meant "reading" here, not "copying". Copying a large fragmented file is as slow as copying a large unfragmented file, because the writing operations are the bottleneck here.
Jul 05 2014
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 07/06/2014 05:19 AM, Wanderer wrote:
 On Saturday, 5 July 2014 at 16:03:17 UTC, Dmitry Olshansky wrote:
 ...
 Pointers are perfectly fine as long as there is no pointer arithmetic.
Wrong. Merely holding a pointer (i.e. a physical address) is unsafe already. Non-deep serialization, or any other "preservation", of such a struct means the GC is unable to keep track of the pointers: the GC moves the object around or deletes it, and you have a tiny black hole in your app. ...
That's not 'merely holding a pointer' and it applies to class references just as much.
 ...

 If you're really into the low-level mumbo jumbo, here are two more
 aspects for you to consider.

 1. Since indirection does not require the object contents to be moved
 around while sorting, these objects can be read into cache once and
 never invalidated later  - unlike "right in place" structs which
 invalidate CPU cache each time a swap is performed. That is a huge cache
 win already.
 ...
This is just random nonsense.
Jul 05 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Sunday, 6 July 2014 at 04:20:21 UTC, Timon Gehr wrote:
 That's not 'merely holding a pointer' and it applies to class 
 references just as much.
Eh? This is genius: you've just "proven" that references are as unsafe as pointers, and that the people who spent so much time and effort designing the whole reference semantics in order to replace pointers and achieve language and memory safety are stupid. Well done.
 This is just random nonsense.
If you say so...
Jul 06 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
Please don't under-quote, thanks.

 Pointers are perfectly fine as long as there is no pointer arithmetic.
Wrong. Merely holding a pointer (i.e. a physical address) is unsafe already. Non-deep serialization, or any other "preservation", of such a struct means the GC is unable to keep track of the pointers: the GC moves the object around or deletes it, and you have a tiny black hole in your app.
On 07/06/2014 01:45 PM, Wanderer wrote:
 On Sunday, 6 July 2014 at 04:20:21 UTC, Timon Gehr wrote:
 That's not 'merely holding a pointer' and it applies to class
 references just as much.
[...] "proven" that references are as unsafe as pointers, [...]
If we are talking about 'merely holding' them, sure. Otherwise, if one manages to serialize and deserialize a class reference non-deeply, it is obviously just as unsafe to dereference it again after deserialization as it would be in the case of a pointer.
 [...] reference semantics in order to replace pointers and achieve language
and memory safety [...]
To achieve safety, it is sufficient to only allow safe _operations_ on pointers; there's nothing particularly unsafe about pointers per se. BTW, in case this discussion is still about settling whether counting the number of swaps of a sorting algorithm is relevant at all, there is a much easier way: there are sorting algorithms where there are asymptotically more swaps than compares in the respective worst cases. (A binary insertion sort, for instance, needs only O(n log n) comparisons but still performs Θ(n²) element moves in the worst case.)
Jul 06 2014
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
06-Jul-2014 07:19, Wanderer writes:
 On Saturday, 5 July 2014 at 16:03:17 UTC, Dmitry Olshansky wrote:
 There are trade-offs. The world is not black and white and I don't
 follow 'one rule everywhere'.
This is not a trade-off at all. You suggested keeping database records linearly, with space gaps between records to support "tiny updates". Without serious updates, this is a major waste of space. With them, your design won't save the day, because any gap will eventually be consumed, and without fragmentation/reordering the storage will fail.
Strawman.
 FYI, nowadays popular databases aren't designed this way, exactly for the
 reasons I described above. All the databases I have worked with (MySQL, Oracle
 and Firebird) hold records in a very compact way, without a single byte of
 gap, with the ability to fragment records, and without total physical
 ordering. So at least the designers of these DBMSs have agreed that your
 design is not practical.
Well, they have to store a link somewhere, right? That's already 4 or 8 bytes of "gap" to have in case of adding an extra field. In fact, taking a peek at a typical DB file, I see LOTS of zeros in between the real data. Again - linked vs linear *is* a trade-off, and it's not taken once and for all. It's good to have the ability to choose, but from your posts it's apparent that you don't need the choice and are just fine without it.
 Pointers are perfectly fine as long as there is no pointer arithmetic.
Wrong. Merely holding a pointer (i.e. a physical address) is unsafe already.
Anyhow your definition of unsafe is hard to grasp.
 Non-deep serialization, or any other "preservation", of such a
 struct means the GC is unable to keep track of the pointers.
You are no better off with references in these kinds of circumstances. The GC is able to track references/pointers; that's its job.
 Which is hopelessly admitting defeat. A pair may have non-trivial
 comparator. And a minor step beyond that such as a pair of double and
 integer and it fails completely.
I said above: in any non-trivial case, use classes instead of overly-clever structures.
Why? I understand that, where you are coming from, there is simply no choice, and hence the argument of 'just use this' holds water. And there is nothing clever about a simple hierarchical composition of records stored contiguously in one piece of memory.
 And if you really, really need premature
 optimization, there is java.nio and buffers. Create ByteBuffer (direct
 one if you need super optimized solution) and treat slices of it as
 "structs".
It would be inferior to structs in many ways, including a really slow binary-serialization-to-the-buffer step (Java has no way to just bit-blit integers into a byte array). BTW I fail to see how NIO would instantly imply a super-optimized solution, especially as it would require reconstructing fields from bytes on each access.
 That's possible and easy to implement, but really not needed
 in practice because all you get is 0.1% of memory saving and no gain in
 speed.
Since in Java you really can't measure this stuff (as there is no way to lay things out in memory), you will excuse me for not trusting you on this one. Storing an extra reference + an unused monitor field + a vtable pointer per pair of 2 integers is about 50% memory wasted.
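To put a number on the struct side, here is a minimal D sketch (the Java-side overhead is the rough figure above, not something this snippet measures):

import std.stdio;

// A plain value pair: 8 bytes, no object header, no reference.
struct Pair
{
    int a;
    int b;
}

void main()
{
    auto pairs = new Pair[](1_000);       // one contiguous block
    writeln(Pair.sizeof);                 // 8
    writeln(pairs.length * Pair.sizeof);  // 8000 bytes of payload, no per-element headers
}

An equivalent pair class in Java adds an object header plus a reference per element on top of the same 8 bytes of payload, which is roughly where the "about 50% wasted" figure comes from.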
 And there Java-style indirection-happy classes "shine". For instance
 modeling any complex stuff with classes alone would lead to things like:
 House--reference->Bedroom---reference-->Bed--reference-->Pillow

 Which is incredible waste of memory and speed on something so simple.
That's not complex. That's very simple.
Oh, sure, now I see: let's discard things that don't seem to fit. It's common knowledge that Java doesn't implement composition properly; the only choice provided is delegation (i.e. keeping a reference instead of actually containing an object). There is simply no argument that's going to prove that "you really don't need composition".
 It would become slightly more
 complex if all house parts implemented the same root interface and basic
 operations, like examining what's around and finding an object given by
 its path or properties, or throwing an extra pillow or two onto the same
 bed. All that is just a dream for structs.
Again your ignorance of what a struct can do shows. Structs can have methods just fine. C++ std::vector is a struct, after all, so an extra pillow goes in just fine. Moreover, using such a data structure with references everywhere ruins performance - a topic you suddenly started to dismiss.
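A sketch of the by-value version (made-up types, D syntax):

import std.stdio;

struct Pillow  { int firmness; }

struct Bed
{
    Pillow[] pillows;                        // dynamic array, so extra pillows are fine
    void addPillow(Pillow p) { pillows ~= p; }
}

struct Bedroom { Bed bed; }
struct House   { Bedroom bedroom; }          // Bedroom and Bed live inline inside House

void main()
{
    House h;
    h.bedroom.bed.addPillow(Pillow(3));      // no House->Bedroom->Bed references to chase
    writeln(h.bedroom.bed.pillows.length);   // 1
}

The only indirection left is the pillows array itself; everything else is a single contiguous value.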
 Copy constructors or in D parlance postblits. There is also CoW and
 whatnot. In fact swap doesn't create a copy in D, it just does a
 bitwise swap in-place.
And here we have a tough choice for structs with pulled subdata "for efficiency": either the assignment operator copies that data too (making routines like sort perform horribly slowly), or it performs only a shallow copy, causing undue sharing of data and violating the whole struct "value" philosophy.
Again - sort doesn't copy values at all, it swaps them. And there are nice ways to design copy semantics, like CoW. There is a reason why Java String is now eagerly copied behind the scenes - it's now (internally) using small string optimization instead of copy-on-write. It's funny how only the JVM can do things at this "level of structs" in Java.
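To illustrate the point (a tiny sketch - Rec and the values are made up, and nothing here defines a postblit, so the swaps are plain bitwise moves):

import std.algorithm : sort, swap;
import std.stdio;

struct Rec { int key; int payload; }

void main()
{
    auto a = [Rec(3, 30), Rec(1, 10), Rec(2, 20)];
    swap(a[0], a[1]);                       // moves the two 8-byte values in place
    a.sort!((x, y) => x.key < y.key)();     // likewise rearranges elements by swapping
    writeln(a);                             // [Rec(1, 10), Rec(2, 20), Rec(3, 30)]
}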
 Disagree - the moment a cache line is loaded it doesn't matter how
 much you read in this line. Even a tiny read missing a cache is paid
 in full.
But the amount of these "missed reads" is low, so less of the cache is invalidated. The CPU cache is not a single page that gets invalidated as a whole; it's more like many small subpages, each of them treated individually.
I know what a CPU cache looks like.
 If you're really into the low-level mumbo jumbo, here are two more
 aspects for you to consider.

 1. Since indirection does not require the object contents to be moved
 around while sorting, these objects can be read into cache once and
 never invalidated later - unlike "right in place" structs which
 invalidate CPU cache each time a swap is performed. That is a huge cache
 win already.
Plain wrong. Then again, seeing that you have a strong Java background, that isn't particularly surprising.
 2. Don't forget that new objects are created in eden space in Java -
 which is essentially the stack area. Very fast, compact, sequential
 memory. If your array fits in eden (and that's surely true for the
 forest problem this thread has started from), then allocating array of
 objects is essentially the same as allocating array of structs: object
 contents aren't "scattered in memory" but are located one-after-one
 without gaps between them.
Nobody said that an array is constructed in one pass or anything like that. There may easily be (and in Java there surely is) allocation in between. And that's just scratching the surface of how weak this argument is.
 That greatly aids caching as well.
Suddenly you too seem to understand that sequential storage aids CPU caches.
 The main
 difference between eden and "array of structs" is that the allocation of
 the former never fails (assuming there's enough memory), only gets
 slightly slower in the worst case, and allocation of the latter can fail
 because "stack overflow error"
Ignorance.
 or "too much memory fragmentation oops",
 even if there's enough free memory.
Now this makes sense but not enough to prove that "references are ultimately better" or any other general statement.
 And the only thing worse than copying a large
 file is copying a large file fragmented into tiny splinters.
You surely meant "reading" here, not "copying". Copying a large fragmented file is as slow as copying a large unfragmented file, because the writing operations are the bottleneck here.
Sequential writes are okay, seeking each block isn't. -- Dmitry Olshansky
Jul 06 2014
parent reply "Wanderer" <no-reply no-reply.org> writes:
On Sunday, 6 July 2014 at 21:31:57 UTC, Dmitry Olshansky wrote:

 Strawman.
 ...
 Again your ignorance of what a struct could do shows.
 ...
 Plain wrong. Again seeing you have strong Java background, 
 isn't particularly surprising.
 ...
 Ignorance.
On Sunday, 6 July 2014 at 04:20:21 UTC, Timon Gehr wrote:
 This is just random nonsense.
I'm starting to think that the D community should grow up a few more years before reasonable discussion is possible.

Over the last few days, I have become disappointed in D - not because the language is still rather immature and unstable, but mainly because, while you guys have an attitude like this in discussions, the situation around D won't change for the better. You simply don't listen, and blindly call "nonsense" everything you don't understand.

I tried my best to stay reasonable and calm in this thread, but these last posts were way over the edge for me. Goodbye.
Jul 07 2014
parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 7 July 2014 at 10:55:38 UTC, Wanderer wrote:
 I'm starting to think that the D community should grow up a few more 
 years before reasonable discussion is possible.
 
 Over the last few days, I have become disappointed in D - not because 
 the language is still rather immature and unstable, but mainly 
 because, while you guys have an attitude like this in discussions, 
 the situation around D won't change for the better. You simply 
 don't listen, and blindly call "nonsense" everything you don't 
 understand.
 
 I tried my best to stay reasonable and calm in this thread, but 
 these last posts were way over the edge for me. Goodbye.
Projection.
Jul 07 2014
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Wanderer"  wrote in message news:jbvbufgyhbjrkpukrdtg forum.dlang.org...

 For pair of integers, you can use long and sort an array of longs.
Awesome, now your sort order depends on processor endianness! Storing structs in contiguous memory is sometimes better for some things. The fact that sometimes it isn't better doesn't change this, but for some reason you've pointed out disadvantages... What point are you trying to make?
Jul 05 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 07/05/2014 07:07 PM, Daniel Murphy wrote:
 "Wanderer"  wrote in message news:jbvbufgyhbjrkpukrdtg forum.dlang.org...

 For pair of integers, you can use long and sort an array of longs.
Awesome, now your sort order depends on processor endianness!
? k=i<<32|j, i=k>>>32, j=k&(1L<<32)-1.
Jul 05 2014
prev sibling parent "Xinok" <xinok live.com> writes:
On Thursday, 3 July 2014 at 07:29:42 UTC, Wanderer wrote:
 Nobody, never, measures sort algorithms by amount of swaps.
Maybe not in swaps, but I have seen sorting algorithms measured similarly, using reads and writes. As others have stated, it can be a useful metric if you're sorting a range of structs. Another example: it's possible to implement an in-place merge sort. It performs only slightly more comparisons than a regular merge sort, yet it's still slower because it requires significantly more reads and writes. It's not the only metric though, and other things are difficult to measure. A bottom-up heapsort performs fewer comparisons than quicksort, but quicksort is still faster. That's because quicksort runs over the range sequentially, which makes it "cache friendly", whereas heapsort relies heavily on random access. Metrics aside, nothing beats a good ol' benchmark.
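E.g. something as simple as this (sizes and data distribution picked arbitrarily):

import std.algorithm : sort, SwapStrategy;
import std.datetime : StopWatch;
import std.random : uniform;
import std.stdio;

void main()
{
    auto data = new int[](1_000_000);
    foreach (ref x; data) x = uniform(0, 1_000_000);

    auto a = data.dup;
    auto b = data.dup;

    StopWatch sw;
    sw.start();
    a.sort!("a < b", SwapStrategy.unstable)();
    sw.stop();
    writefln("unstable: %s ms", sw.peek().msecs);

    sw.reset();
    sw.start();
    b.sort!("a < b", SwapStrategy.stable)();
    sw.stop();
    writefln("stable:   %s ms", sw.peek().msecs);
}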
Jul 03 2014
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 23 Jun 2014 11:38:24 -0400, Chris Cain <zshazz gmail.com> wrote:

 On Monday, 23 June 2014 at 15:19:15 UTC, Steven Schveighoffer wrote:
 Is it just me, or does this seem unintuitive? I would think a stable  
 sort requires extra care, i.e. extra time, to ensure stability.

 Do we need an unstable sort then? Or is this a corner case? I am fully  
 ignorant on these advanced sorting routines and how they work. The  
 Quicksort-based sort routines are like black magic to me, my knowledge  
 stops at merge sort :)
 Technically, you can prove that there exists some unstable sort that is  
 always faster than a stable sort. Andrei showed me this once:

 [2,2,2,2,1]

 With unstable sort, the 1 and first 2 could be swapped, but with stable  
 sort, the 1 must swap with each 2 until it settles into the 0th position.

 That said, I've never seen the unstable sort that's always faster than  
 any stable sort and since it doesn't appear in any of the languages I  
 know of, I've got to guess that no one else has discovered it either.
Right. It's all dependent on whether the unstable sort can make the "right decisions." In my mind, ensuring stability seems like an extra task. In other words, (and I know it's not true) for every stable sort, there exists an equivalent unstable sort that does not have to do extra work to preserve ordering. But some sorts are stable by nature. It's probably a factor of the two sorts using different algorithms that happen to work better on some data sets.
 That said, I've seen some cases where the unstable sort is faster  
 (particularly in cases where the ordering of the elements is nearly  
 completely random), so it's not as if unstable sort is currently always  
 worse. It's just in many real-world scenarios it ends up being worse.  
 Furthermore, our stable sort (an implementation of Timsort) is  
 impossible to really use "incorrectly" ... even using it very "stupidly"  
 (as an example, append an element on the end of a sorted array and  
 "insert" it by calling stable sort) will result in reasonable runtimes  
 and do what the programmer expects of it. It makes writing simple quick  
 scripts and prototypes absurdly easy and still very efficient. Plus  
 maintaining the current order of the elements is really nice in general.

 Personally, I'm still in the camp of suggesting that our default should  
 be stable sort and if you're certain your use case will benefit from the  
 extra speed of unstable sort, you should specify it. But I don't think  
 that's going to change.
This seems like sound advice. An interesting "feature" might be to try to predict which sort is working better for a given process. In other words, choose a proportion of sorts to perform using the stable algorithm and some using the unstable one, then compare the speed per element and adjust the proportion to favor the faster sort (obviously only when the sort method is not specified). I wonder how well this might perform for tasks that sort frequently. You could wrap the sorting routine in another function which does this; a rough sketch of such a wrapper follows.

-Steve
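Something along these lines - purely a sketch; the struct name, the 10% exploration rate and the averaging factor are all made up for illustration:

import std.algorithm : sort, SwapStrategy;
import std.datetime : StopWatch;
import std.random : uniform;

struct AdaptiveSorter
{
    double[2] costPerElem = [0, 0];   // running cost estimates: [stable, unstable]
    bool[2] seen = [false, false];

    void sortAdaptive(T)(T[] data)
    {
        // Try each strategy once, then mostly use the cheaper one while still
        // exploring the other roughly 10% of the time.
        int pick;
        if (!seen[0]) pick = 0;
        else if (!seen[1]) pick = 1;
        else if (uniform(0.0, 1.0) < 0.1) pick = uniform(0, 2);
        else pick = costPerElem[0] <= costPerElem[1] ? 0 : 1;

        StopWatch sw;
        sw.start();
        if (pick == 0) data.sort!("a < b", SwapStrategy.stable)();
        else           data.sort!("a < b", SwapStrategy.unstable)();
        sw.stop();

        // Update the per-element cost estimate for the strategy we just used.
        auto cost = cast(double) sw.peek().usecs / (data.length ? data.length : 1);
        costPerElem[pick] = seen[pick] ? 0.9 * costPerElem[pick] + 0.1 * cost : cost;
        seen[pick] = true;
    }
}

void main()
{
    import std.stdio : writeln;
    AdaptiveSorter s;
    foreach (run; 0 .. 50)
    {
        auto a = new int[](10_000);
        foreach (ref x; a) x = uniform(0, 1_000);
        s.sortAdaptive(a);
    }
    writeln(s.costPerElem);   // rough microseconds per element for each strategy
}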
Jun 23 2014
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 21 June 2014 at 01:07:20 UTC, safety0ff wrote:
 std.algorithm.sort with SwapStrategy.unstable is considerably 
 slower than his, whereas builtin sort is  abysmal.

 I find that generally using SwapStrategy.stable performs better.
 This isn't universal though as there are cases where it is 
 slower.
wat

Shouldn't unstable be faster, otherwise we might as well just alias unstable = stable!?
Jun 20 2014
next sibling parent "safety0ff" <safety0ff.dev gmail.com> writes:
On Saturday, 21 June 2014 at 01:48:20 UTC, Peter Alexander wrote:
 On Saturday, 21 June 2014 at 01:07:20 UTC, safety0ff wrote:
 std.algorithm.sort with SwapStrategy.unstable is considerably 
 slower than his, whereas builtin sort is  abysmal.

 I find that generally using SwapStrategy.stable performs 
 better.
 This isn't universal though as there are cases where it is 
 slower.
wat Shouldn't unstable be faster, otherwise we might as well just alias unstable = stable!?
Many times my data is the result of mapping sorted/partially sorted input to a function which does not randomize order. This creates many partially sorted runs. Timsort is used for stable sorting and it was designed for this type of input.
Jun 20 2014
prev sibling parent reply Joseph Rushton Wakeling via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 21/06/14 03:48, Peter Alexander via Digitalmars-d wrote:
 On Saturday, 21 June 2014 at 01:07:20 UTC, safety0ff wrote:
 std.algorithm.sort with SwapStrategy.unstable is considerably slower than his,
 whereas builtin sort is  abysmal.

 I find that generally using SwapStrategy.stable performs better.
 This isn't universal though as there are cases where it is slower.
wat Shouldn't unstable be faster, otherwise we might as well just alias unstable = stable!?
I can't find the issue in bugzilla, but IIRC there was a problem with unstable sort where it would produce worst-case-scenario behaviour in the event of being given input that had only a few unsorted elements. I ran into it when implementing some of the algorithms in Dgraph: appending one new element to an array and using sort() would generate quadratic behaviour. That case of "mostly sorted" input seems to also be present in the problem described here.
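The pattern was roughly this (a reconstructed sketch, not the actual Dgraph code):

import std.algorithm : sort;

void main()
{
    int[] a;
    foreach (i; 0 .. 100_000)
        a ~= i;    // the array is already sorted
    a ~= 42;       // append one out-of-place element
    sort(a);       // "mostly sorted" input - the case that used to go quadratic
}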
Jun 21 2014
parent reply "Xinok" <xinok live.com> writes:
On Saturday, 21 June 2014 at 09:19:19 UTC, Joseph Rushton 
Wakeling via Digitalmars-d wrote:
 I can't find the issue in bugzilla, but IIRC there was a 
 problem with unstable sort where it would produce 
 worst-case-scenario behaviour in the event of being given input 
 that had only a few unsorted elements.

 I ran into it when implementing some of the algorithms in 
 Dgraph: appending one new element to an array and using sort() 
 would generate quadratic behaviour.

 That case of "mostly sorted" input seems to also be present in 
 the problem described here.
That has since been fixed. I implemented heapsort as a fallback algorithm, effectively making it an introsort.
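For anyone curious, a rough sketch of the idea - this is not the Phobos code, just its shape: quicksort as usual, but once the recursion depth exceeds about 2*log2(n), finish that subrange with heapsort, so the overall worst case stays O(n log n).

import std.algorithm : swap, isSorted;
import core.bitop : bsr;
import std.random : uniform;
import std.stdio;

void introSortSketch(int[] a)
{
    if (a.length > 1)
        quickSortLimited(a, 2 * (bsr(a.length) + 1));   // depth budget ~ 2*log2(n)
}

void quickSortLimited(int[] a, int depth)
{
    if (a.length < 2) return;
    if (depth == 0) { heapSortSketch(a); return; }      // give up on quicksort here

    // Deliberately naive Lomuto partition around the last element.
    auto pivot = a[$ - 1];
    size_t i = 0;
    foreach (j; 0 .. a.length - 1)
        if (a[j] < pivot) swap(a[i++], a[j]);
    swap(a[i], a[$ - 1]);

    quickSortLimited(a[0 .. i], depth - 1);
    quickSortLimited(a[i + 1 .. $], depth - 1);
}

void heapSortSketch(int[] a)
{
    foreach_reverse (start; 0 .. a.length / 2)          // build a max-heap
        siftDown(a, start, a.length);
    foreach_reverse (end; 1 .. a.length)                // repeatedly extract the max
    {
        swap(a[0], a[end]);
        siftDown(a, 0, end);
    }
}

void siftDown(int[] a, size_t root, size_t end)
{
    while (2 * root + 1 < end)
    {
        auto child = 2 * root + 1;
        if (child + 1 < end && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) return;
        swap(a[root], a[child]);
        root = child;
    }
}

void main()
{
    auto a = new int[](10_000);
    foreach (ref x; a) x = uniform(0, 1_000);
    introSortSketch(a);
    writeln(isSorted(a));   // true
}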
Jun 21 2014
next sibling parent Joseph Rushton Wakeling via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 21/06/14 17:28, Xinok via Digitalmars-d wrote:
 That has since been fixed. I implemented heapsort as a fallback algorithm,
 effectively making it an introsort.
Fantastic, thanks very much for that. :-)
Jun 21 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/21/14, 8:28 AM, Xinok wrote:
 On Saturday, 21 June 2014 at 09:19:19 UTC, Joseph Rushton Wakeling via
 Digitalmars-d wrote:
 I can't find the issue in bugzilla, but IIRC there was a problem with
 unstable sort where it would produce worst-case-scenario behaviour in
 the event of being given input that had only a few unsorted elements.

 I ran into it when implementing some of the algorithms in Dgraph:
 appending one new element to an array and using sort() would generate
 quadratic behaviour.

 That case of "mostly sorted" input seems to also be present in the
 problem described here.
That has since been fixed. I implemented heapsort as a fallback algorithm, effectively making it an introsort.
Wait, when did that happen? I think a better solution would be to take median of more than 3 elements. -- Andrei
Jun 22 2014
parent reply "Xinok" <xinok live.com> writes:
On Sunday, 22 June 2014 at 08:08:44 UTC, Andrei Alexandrescu 
wrote:
 On 6/21/14, 8:28 AM, Xinok wrote:
 That has since been fixed. I implemented heapsort as a fallback
 algorithm, effectively making it an introsort.
Wait, when did that happen? I think a better solution would be to take median of more than 3 elements. -- Andrei
https://github.com/D-Programming-Language/phobos/pull/1735

I did experiment with taking a median of five. There wasn't much benefit to doing so, and it added considerable overhead. I decided it best to leave quicksort as it is. If anybody else can come up with something better, they're free to do a pull request.

The thing about introsort is that it guarantees non-quadratic time with almost zero overhead, something taking the median of N elements cannot guarantee... unless you take the median of hundreds of elements, which would be incredibly complex and slow.
Jun 22 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/22/14, 7:47 AM, Xinok wrote:
 On Sunday, 22 June 2014 at 08:08:44 UTC, Andrei Alexandrescu wrote:
 On 6/21/14, 8:28 AM, Xinok wrote:
 That has since been fixed. I implemented heapsort as a fallback
 algorithm, effectively making it an introsort.
Wait, when did that happen? I think a better solution would be to take median of more than 3 elements. -- Andrei
https://github.com/D-Programming-Language/phobos/pull/1735
Thanks. I wonder if it's possible to factor some of the heap sort code with the BinaryHeap code; they look very similar.
 I did experiment with taking a median of five. There wasn't much benefit
 to doing so and it added a considerable overhead. I decided it best to
 leave quicksort as it is. If anybody else can come up with something
 better, they're free to do a pull request.
I was thinking of sorting in place a stride (every k elements in the original array) first; then the median is the middle of the stride. The length of the stride would be proportional to log(n).

Anyhow, I see you have a test with this array:

foreach(k; 0..arr.length) arr[k] = k;
swapRanges(arr[0..$/2], arr[$/2..$]);

and you count the comparisons. Nice! We should add arrays with other statistics, too (e.g.: presorted, random uniform, random Gaussian). Also we should count the number of swaps, the other primitive operation in sorting.
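Roughly this kind of thing (just a sketch of the idea, not a proposed patch; it assumes a non-empty array and glosses over the exact stride-length choice):

import std.algorithm : sort;
import std.range : stride;
import core.bitop : bsr;
import std.stdio;

// Sort every step-th element in place, where the stride holds roughly log2(n)
// elements, and take the middle element of the stride as the pivot.
int pickPivot(int[] a)
{
    auto logn = bsr(a.length | 1) + 1;
    auto step = a.length / logn;
    if (step < 1) step = 1;
    auto s = a.stride(step);
    sort(s);                  // sorts the strided view, mutating a
    return s[s.length / 2];
}

void main()
{
    auto a = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0];
    writeln(pickPivot(a));    // 7 for this input
    writeln(a);               // the strided positions are now sorted
}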
 The thing about introsort is that it guarantees non-quadratic time with
 almost zero overhead, something taking the median of N elements cannot
 guarantee... unless you take the median of hundreds of elements, which
 would be incredibly complex and slow.
http://dlang.org/phobos/std_algorithm.html#topN is linear. Andrei
Jun 22 2014
parent reply "Xinok" <xinok live.com> writes:
On Sunday, 22 June 2014 at 15:51:13 UTC, Andrei Alexandrescu 
wrote:
 On 6/22/14, 7:47 AM, Xinok wrote:
 On Sunday, 22 June 2014 at 08:08:44 UTC, Andrei Alexandrescu 
 wrote:
 On 6/21/14, 8:28 AM, Xinok wrote:
 That has since been fixed. I implemented heapsort as a 
 fallback
 algorithm, effectively making it an introsort.
Wait, when did that happen? I think a better solution would be to take median of more than 3 elements. -- Andrei
https://github.com/D-Programming-Language/phobos/pull/1735
Thanks. I wonder if it's possible to factor some of the heap sort code with the BinaryHeap code; they look very similar.
That's a possibility, and BinaryHeap could benefit from it. My heapsort is over twice as fast and performs almost half the number of comparisons.
 I did experiment with taking a median of five. There wasn't 
 much benefit
 to doing so and it added a considerable overhead. I decided it 
 best to
 leave quicksort as it is. If anybody else can come up with 
 something
 better, they're free to do a pull request.
I was thinking of sorting in place a stride (every k elements in the original array) first; then the median is the middle of the stride. The length of the stride would be proportional to log(n).

Anyhow, I see you have a test with this array:

foreach(k; 0..arr.length) arr[k] = k;
swapRanges(arr[0..$/2], arr[$/2..$]);

and you count the comparisons. Nice! We should add arrays with other statistics, too (e.g.: presorted, random uniform, random Gaussian). Also we should count the number of swaps, the other primitive operation in sorting.
The only reason I added that test is to ensure it's falling back to heapsort properly. My intention was not to collect statistics; I simply couldn't think of a simpler or more direct way of performing this test.
 The thing about introsort is that it guarantees non-quadratic 
 time with
 almost zero overhead, something taking the median of N 
 elements cannot
 guarantee... unless you take the median of hundreds of 
 elements, which
 would be incredibly complex and slow.
http://dlang.org/phobos/std_algorithm.html#topN is linear.
Unfortunately not. I managed to trigger quadratic time using the same test case as above. If you're not aware, topN simply works by recursively partitioning the range until the Nth element is in place (in essence, it performs a partial quick sort). The author of this implementation made the poor decision of simply choosing the last element as the pivot, not even a median of three. http://dpaste.dzfl.pl/f30fd2539f66
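For reference, the shape of the trigger (the dpaste link above has the actual test; this just feeds the same rotated-sorted input to topN):

import std.algorithm : topN, swapRanges;
import std.stdio;

void main()
{
    auto arr = new int[](100_000);
    foreach (i, ref x; arr) x = cast(int) i;
    swapRanges(arr[0 .. $ / 2], arr[$ / 2 .. $]);  // the same shuffle as the sort test above
    topN(arr, arr.length / 2);                     // slow with the last-element pivot described above
    writeln(arr[arr.length / 2]);                  // the median, once topN finishes
}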
Jun 22 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 6/22/14, 10:07 AM, Xinok wrote:
 On Sunday, 22 June 2014 at 15:51:13 UTC, Andrei Alexandrescu wrote:
 On 6/22/14, 7:47 AM, Xinok wrote:
 On Sunday, 22 June 2014 at 08:08:44 UTC, Andrei Alexandrescu wrote:
 On 6/21/14, 8:28 AM, Xinok wrote:
 That has since been fixed. I implemented heapsort as a fallback
 algorithm, effectively making it an introsort.
Wait, when did that happen? I think a better solution would be to take median of more than 3 elements. -- Andrei
https://github.com/D-Programming-Language/phobos/pull/1735
Thanks. I wonder if it's possible to factor some of the heap sort code with the BinaryHeap code; they look very similar.
That's a possibility, and BinaryHeap could benefit from it. My heapsort is over twice as fast and performs almost half the number of comparisons.
Smile, you're on camera! https://issues.dlang.org/show_bug.cgi?id=12966
 http://dlang.org/phobos/std_algorithm.html#topN is linear.
Unfortunately not. I managed to trigger quadratic time using the same test case as above. If you're not aware, topN simply works by recursively partitioning the range until the Nth element is in place (in essence, it performs a partial quick sort). The author of this implementation made the poor decision of simply choosing the last element as the pivot, not even a median of three. http://dpaste.dzfl.pl/f30fd2539f66
That would be me - terrible, I know :o). I'd filed a task a while ago; now I've assigned you to it, if you'd like to take a look: https://issues.dlang.org/show_bug.cgi?id=12141

Thanks,

Andrei
Jun 22 2014