
digitalmars.D.learn - Quadratic time to sort in a simple case?

reply "Ivan Kazmenko" <gassa mail.ru> writes:
Hi!

Consider a sorted array.  Append an element that is less than all 
the previous elements.  Then sort the array again by the sort 
function from std.algorithm.

An example follows:

-----
import std.algorithm;
import std.datetime;
import std.range;
import std.stdio;

void main ()
{
	int n = 30_000;
	auto a = array (iota (n)) ~ [0];
	auto st = Clock.currTime ();
	sort (a);
	auto et = Clock.currTime ();
	writeln (et - st);
}
-----

With n = 30_000 as in the example, this takes on the order of 
a second on a modern computer, which is clearly O(n^2).  I am 
using DMD 2.062.

Well, now I have a point to make and a question.

My point is that I think this is bad because such a pattern 
(sorted array ~ small element, then sort) is fairly likely to 
occur - well, it did for me, hence the post.  Sure, every fast 
and simple method for choosing the pivot in quicksort will have 
its O(n^2) corner cases.  But there's a difference between a 
corner case that never occurs in "real" data (but still could be 
constructed artificially) and a corner case describable by such a 
simple pattern.

The question is rather predictable, really: what is the 
guaranteed-n-log-n replacement for sort in the standard library?  
I've found the following way...
	sort !("a < b", SwapStrategy.stable) (a);
...but it forces one to specify the redundant "a < b" and looks 
a bit clumsy for an everyday sort() call.

Tried to keep it short(er) this time,

Ivan Kazmenko.
Apr 19 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ivan Kazmenko:

 Consider a sorted array.  Append an element that is less than 
 all the previous elements.  Then sort the array again by the 
 sort function from std.algorithm.
If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.
 The question is rather predictable, really: what is the 
 guaranteed-n-log-n replacement for sort in the standard 
 library?  I've found the following way...
 	sort !("a < b", SwapStrategy.stable) (a);
That uses TimSort, which contains the optimization you are looking for.
 ...but it forces to specify the redundant "a < b" and looks a 
 bit clumsy for an everyday sort() call.
Then try to define an alias (untested):

alias mySort = sort!("a < b", SwapStrategy.stable);

Bye,
bearophile
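For reference, a self-contained sketch of the alias approach (assuming a DMD new enough for the `alias x = y;` syntax, i.e. 2.061+; `mySort` is just the name from the post):

```d
import std.algorithm : sort, SwapStrategy, isSorted;
import std.array : array;
import std.range : iota;

// Stable sort (TimSort) under a short, everyday name.
alias mySort = sort!("a < b", SwapStrategy.stable);

void main()
{
    auto a = array(iota(30_000)) ~ [0]; // sorted run plus one small element
    mySort(a);                          // O(n log n), no quadratic blowup
    assert(a.isSorted);
}
```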
Apr 19 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
 That [SwapStrategy.stable] uses the TimSort that contains
 the optimization you look for.

 alias mySort = sort!("a < b", SwapStrategy.stable);
Thank you, these are good for now. A seemingly easy way to provide the choice globally would be to have an assignable SwapStrategy.default in std.algorithm... or would that be a bad design decision to have such a global state?
 Consider a sorted array.  Append an element that is less than 
 all the previous elements.  Then sort the array again by the 
 sort function from std.algorithm.
If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.
That behavior isn't always easy to predict.

Still, I think this is a problem in Phobos which should be fixed. In most implementations of quicksort (even middle-element, the easiest to come up with), one expects it to perform in O(n log n) with probability close to one, except on some artificially constructed cases.

On a side note, I'm surprised that median-of-three (getPivot in std.algorithm) fails at such a simple test, so I had something new to learn there, too. If I just comment out the switch in getPivot and "return mid" directly, it becomes fast enough in this case.

Another view at this is the following. As far as I know by now, one of the points behind D (and Phobos) design is to have everything safe by default, but faster if you want it explicitly and you know what you are doing. In this particular case, that would mean having a worst-case O(n log n) sort, and/or a stable sort, by default - but with the opportunity to switch to the (time-wise and stability-wise) unsafe quicksort if you really need the extra speedup and know what you are doing.

So, why isn't TimSort the default? The one reason I can imagine is that then D would suffer in language comparisons which just take the most easy-to-use "default sort" and don't care about the extra features it got. Are there any other?

Ivan Kazmenko.
Apr 19 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Ivan Kazmenko:

 A seemingly easy way to provide the choice globally would be to 
 have an assignable SwapStrategy.default in std.algorithm... or 
 would that be a bad design decision to have such a global state?
With that I think there is no way to write a pure sort. Hopefully someday the sort() will be pure.

Generally, global mutable state is bad for writing unittests and for reasoning about code. So I don't think Andrei will do that. It's not kosher.
 Still, I think this is a problem in Phobos which should be 
 fixed.
  In most implementations of quicksort (even middle-element, the 
 easiest to come up with), one expects it to perform in O(n log 
 n) with probability close to one, except on some artificially 
 constructed cases.

 On a side note, I'm surprised that median-of-three (getPivot in 
 std.algorithm) fails at such a simple test, so I had something 
 new to learn there, too.  If I just comment out the switch in 
 getPivot and "return mid" directly, it becomes fast enough in 
 this case.
Your case is a common one, so I think this problem should be studied a little better. I suggest writing a somewhat better benchmark that shows the problem, and putting it in Bugzilla. At worst it will be closed as wontfix.
 Another view at this is the following.  As far as I know by 
 now, one of the points beyond D (and Phobos) design is to have 
 everything safe by default, but faster if you want it 
 explicitly and you know what you are doing.
Right, but I think that D Zen rule is mostly about things like memory safety, etc.
 So, why isn't TimSort the default?
Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraint of the sort being stable, I think you should be able to write a slightly faster sorting algorithm (I don't know if this is true).

Bye,
bearophile
Apr 19 2013
parent reply "Chris Cain" <clcain uncg.edu> writes:
On Friday, 19 April 2013 at 22:56:00 UTC, bearophile wrote:
 So, why isn't TimSort the default?
Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true).
I'm just throwing my 2c in, but I think TimSort ought to be the default. It's simply the safer option. Since worst-case performance can be forced by a crafted input (and it can be, unless the pivots are, at least, pseudo-randomly chosen), there is a very real risk of the default sort being used in a way that can be compromised by an attacker. From this perspective, it seems to me that TimSort being the default would support design goal #5 of D, "Make doing things the right way easier than the wrong way."

Plus, TimSort seems to be faster in most cases in my attempts. The only cases I could find where it wasn't faster were when I could guarantee that the data I was passing in had no order to it. In cases where you suspect (but can't guarantee) data is sorted when passed in (think fetching from a web API that gives you sorted data, but the docs don't say it should), calling TimSort is nearly as fast as calling "isSorted" ...

So, my recommendation is to just call TimSort and not worry about the extra branching code ([check sortedness; if not sorted, call sort] vs [call sort]). This makes it so the "fast" code is also very convenient to write.

TBH, though, I think the reason that TimSort is not the default is because it was added only semi-recently. The old stable sort was not nearly as fast as the unstable sort (and, in fact, IIRC, it didn't work properly when I tried it). I doubt that someone intentionally said that quicksort was faster than TimSort on average ... it's just that no one thought to change the default when TimSort was added. Maybe an enhancement request could be made on this?
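The branching pattern being compared above looks like this in D (a sketch; relative timings will of course vary):

```d
import std.algorithm : isSorted, sort, SwapStrategy;
import std.array : array;
import std.range : iota;

void main()
{
    auto a = array(iota(100_000));

    // Option 1: explicit branch -- check first, sort only if needed.
    if (!a.isSorted)
        sort(a);

    // Option 2: just call the stable sort (TimSort); on already-sorted
    // input its initial scan costs about as much as isSorted anyway.
    sort!("a < b", SwapStrategy.stable)(a);

    assert(a.isSorted);
}
```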
Apr 20 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
20-Apr-2013 16:22, Chris Cain writes:
 On Friday, 19 April 2013 at 22:56:00 UTC, bearophile wrote:
 So, why isn't TimSort the default?
Probably because someone thinks that "on average" the other sort is faster. In theory the other should be faster, because if you relax the constraints of the sort being stable I think in theory you should be able to write a little faster sorting algorithm (I don't know if this is true).
I'm just throwing my 2c in, but I think TimSort ought to be the default. It's simply the safer option. Since worst-case performance can be designed (and it can be designed unless the pivots are, at least, pseudo-randomly chosen), there is a very real risk of the default sort being used in a way that can be compromised by an attacker. From this perspective, seems to be like TimSort being default would support the design goal #5 of D, "Make doing things the right way easier than the wrong way."
And this all is good, but TimSort allocates O(N) memory. The constant in front of N is smallish, less than 1.0, but it could cause some grief.

--
Dmitry Olshansky
Apr 20 2013
parent reply "Xinok" <xinok live.com> writes:
On Saturday, 20 April 2013 at 16:35:25 UTC, Dmitry Olshansky 
wrote:
 And this all is good but TimSort allocates O(N) memory. The 
 constant in front of N is smallish, less than 1.0, but it could 
 cause some grief.
Worst case is O(n/2), but it starts small and doubles in size as needed. On a partially sorted array, it may only use a small amount of memory.
Apr 22 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
23-Apr-2013 05:17, Xinok writes:
 On Saturday, 20 April 2013 at 16:35:25 UTC, Dmitry Olshansky wrote:
 And this all is good but TimSort allocates O(N) memory. The constant
 in front of N is smallish, less than 1.0, but it could cause some grief.
Worst case is O(n/2), but it starts small and doubles in size as needed. On a partially sorted array, it may only use a small amount of memory.
Good to know; still, it's something to keep in mind, especially for allocation-free minded folks.

--
Dmitry Olshansky
Apr 24 2013
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/19/13 6:37 PM, Ivan Kazmenko wrote:
 That [SwapStrategy.stable] uses the TimSort that contains
 the optimization you look for.

 alias mySort = sort!("a < b", SwapStrategy.stable);
Thank you, these are good for now. A seemingly easy way to provide the choice globally would be to have an assignable SwapStrategy.default in std.algorithm... or would that be a bad design decision to have such a global state?
Yah, that would mean one can't reason about what a piece of code does without knowing what the context was.
 Consider a sorted array. Append an element that is less than all the
 previous elements. Then sort the array again by the sort function
 from std.algorithm.
If you know that, then don't do a sort. Make some free space in the array, shift the items, sort just the first part with a slice.
That behavior isn't always easy to predict. Still, I think this is a problem in Phobos which should be fixed. In most implementations of quicksort (even middle-element, the easiest to come up with), one expects it to perform in O(n log n) with probability close to one, except on some artificially constructed cases. On a side note, I'm surprised that median-of-three (getPivot in std.algorithm) fails at such a simple test, so I had something new to learn there, too. If I just comment out the switch in getPivot and "return mid" directly, it becomes fast enough in this case.
I could have sworn I made getPivot choose at random at some point. I agree this should be fixed; there are a number of possibilities:

a) choose median of five

b) if the array is large, sort a stride of it first (e.g. 1%), then choose the pivot as the median point of that stride

c) add introspection to the algorithm, i.e. if an attempt to partition hits the worst case or near worst case, just try another pivot instead of moving forward with the sorting stage.
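Option (a) might look roughly like this (a hypothetical `medianOfFive` helper for illustration, not Phobos' actual getPivot):

```d
import std.algorithm : swap;

// Pick five evenly spaced samples, order their indices by element
// value with a tiny insertion sort, and return the middle index.
size_t medianOfFive(T)(T[] a)
{
    assert(a.length >= 5);
    size_t[5] idx = [0, a.length / 4, a.length / 2,
                     3 * a.length / 4, a.length - 1];
    foreach (i; 1 .. idx.length)
        for (auto j = i; j > 0 && a[idx[j]] < a[idx[j - 1]]; --j)
            swap(idx[j], idx[j - 1]);
    return idx[2]; // index of the median of the five samples
}

void main()
{
    auto a = [9, 1, 8, 2, 7, 3, 6, 4, 5];
    // samples are a[0]=9, a[2]=8, a[4]=7, a[6]=6, a[8]=5; median is 7
    assert(a[medianOfFive(a)] == 7);
}
```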
 Another view at this is the following. As far as I know by now, one of
 the points beyond D (and Phobos) design is to have everything safe by
 default, but faster if you want it explicitly and you know what you are
 doing.
That view of safety is different (memory safety).
 In this particular case, that would mean having a worst-case O(n
 log n) sort, and/or a stable sort, by default - but with the opportunity
 to switch to the (time-wise and stability-wise) unsafe quicksort if you
 really need the extra speedup and know what you are doing. So, why isn't
 TimSort the default?
TimSort is slower on average.

Andrei
Apr 22 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 c) add introspection to the algorithm, i.e. if an attempt to 
 partition hits the worst case or near worst case, just try 
 another pivot instead of moving forward with the sorting stage.
Or switch to a sort that is guaranteed to have a pseudo-linear complexity. I am not sure, but I think the C++ STL sort does this.
 TimSort is slower on average.
Here it's not easy to define what "average" is. Python devs think that a common case is an array with mostly sorted data with unsorted data at the end.

Bye,
bearophile
Apr 22 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
And by the way, I still suggest a dual-pivot quicksort:

http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/2628

Bye,
bearophile
Apr 22 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 4/22/13 5:52 PM, bearophile wrote:
 Andrei Alexandrescu:

 c) add introspection to the algorithm, i.e. if an attempt to partition
 hits the worst case or near worst case, just try another pivot instead
 of moving forward with the sorting stage.
Or switch to a sort that is guaranteed to have a pseudo-linear complexity. I am not sure, but I think the C++ STL sort does this.
There's no such thing as "the C++ STL sort". Any implementation is fine as long as it has O(n log n) expected run time.
 TimSort is slower on average.
Here it's not easy to define what "average" is. Python devs think that a common case is an array with mostly sorted data with unsorted data at the end.
Note that in this case it's sorted data followed by its smallest element. (Changing the value does improve speed.)

This is hitting a corner case: median of 3 will pick the smallest element, which will lead to an inefficient partition. I haven't looked at the code, but it seems the smallest element again makes it to the beginning and the end of the array, so it's again picked, etc.

One simple solution to this (which I forgot to mention) is picking a pivot at random.

Andrei
Apr 23 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 There's not "the C++ STL sort". Any implementation is fine as 
 long as it has O(n log n) expected run time.
This seems to use Introsort:
http://www.sgi.com/tech/stl/sort.html

I don't know if Xinok has tested a 2-pivot quicksort (called by the usual Introsort setting).

Bye,
bearophile
Apr 23 2013
parent reply "Xinok" <xinok live.com> writes:
On Tuesday, 23 April 2013 at 14:12:01 UTC, bearophile wrote:
 Andrei Alexandrescu:

 There's not "the C++ STL sort". Any implementation is fine as 
 long as it has O(n log n) expected run time.
This seems to use the Introsort: http://www.sgi.com/tech/stl/sort.html I don't know if Xinok has tested a 2-pivot quicksort (called by the usual Introsort setting). Bye, bearophile
On an average case, two-pivot quicksort will increase the number of comparisons by about 50%, the reason being that about half the elements will be greater than the first pivot and have to be compared to the second pivot. There's also greater complexity given there are three partitions rather than two, increasing the amount of I/O necessary.

Introsort is simply a quicksort which falls back to a heap sort after so many recursions to guarantee O(n log n) performance. I've implemented this using a median of five and it works excellently. This is what I plan to contribute to Phobos.

Choosing a random pivot will always invoke "average" performance, whereas finding the median of N elements usually achieves better performance on sorted lists. You also have to be careful that equal elements are distributed equally among both partitions, else a list with many elements equal to the pivot will still achieve poor performance. (Equal elements would all fall onto one side of the pivot.)
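A compact illustrative sketch of the introsort scheme described above (not Xinok's implementation): quicksort with a recursion budget that falls back to heap sort when the budget runs out. For brevity a plain middle-element pivot with Lomuto partitioning stands in for the median of five:

```d
import std.algorithm : swap;

// Heap sort: the guaranteed O(n log n) fallback.
void siftDown(int[] a, size_t i, size_t n)
{
    while (2 * i + 1 < n)
    {
        auto c = 2 * i + 1;                    // left child
        if (c + 1 < n && a[c] < a[c + 1]) ++c; // pick the larger child
        if (a[i] >= a[c]) break;
        swap(a[i], a[c]);
        i = c;
    }
}

void heapSort(int[] a)
{
    foreach_reverse (i; 0 .. a.length / 2) siftDown(a, i, a.length);
    foreach_reverse (i; 1 .. a.length)
    {
        swap(a[0], a[i]);  // move current max to its final place
        siftDown(a, 0, i);
    }
}

// Quicksort with a recursion budget; heap sort when it runs out.
void introSort(int[] a, size_t depth)
{
    while (a.length > 1)
    {
        if (depth-- == 0) { heapSort(a); return; }
        // middle-element pivot + Lomuto partition (a real version
        // would use median-of-five as discussed above)
        swap(a[a.length / 2], a[$ - 1]);
        size_t store = 0;
        foreach (k; 0 .. a.length - 1)
            if (a[k] < a[$ - 1]) swap(a[k], a[store++]);
        swap(a[store], a[$ - 1]);    // pivot lands at 'store'
        introSort(a[0 .. store], depth);
        a = a[store + 1 .. $];       // loop on the right part
    }
}

void main()
{
    import std.algorithm : isSorted;
    import std.array : array;
    import std.range : iota, retro;
    auto a = array(iota(10_000).retro); // reversed order
    introSort(a, 26);                   // budget ~ 2 * log2(n)
    assert(a.isSorted);
}
```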
Apr 23 2013
parent "Ivan Kazmenko" <gassa mail.ru> writes:
On Tuesday, 23 April 2013 at 17:42:08 UTC, Xinok wrote:

 Choosing a random pivot will always invoke "average" 
 performance where as finding the median of N elements usually 
 achieves better performance on sorted lists. You also have to 
 be careful that equal elements are distributed equally among 
 both partitions, else a list with many elements equal to the 
 pivot will still achieve poor performance. (equal elements 
 would all fall onto one side of the pivot)

 Introsort is simply a quick sort which falls back to a heap 
 sort after so many recursions to guarantee O(n log n) 
 performance. I've implemented this using a median of five and 
 it works excellent. This is what I plan to contribute to Phobos.
I like the sound of that!

Regarding my previous post, it's perhaps worth mentioning Go in the comparison: it uses QuickSort with median-of-three for small sizes and median of medians-of-three for larger sizes [1]. And it actually sorts the three medians in median-of-three, which sounds like a good thing to do. They cite "Engineering a Sort Function" (1993) by Bentley and McIlroy as inspiration [2].

Ivan Kazmenko.

-----
[1] http://golang.org/src/pkg/sort/sort.go#L104
[2] http://www.enseignement.polytechnique.fr/informatique/profs/Luc.Maranget/421/09/bentley93engineering.pdf
Apr 23 2013
prev sibling parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
On Monday, 22 April 2013 at 21:34:47 UTC, Andrei Alexandrescu 
wrote:

 a) choose median of five
Sounds like it could decrease speed on average, and still perhaps fail on a simple case like [0, 1, 2, 3, ..., 999, 0, 0] or something? Didn't test it though.
 b) if the array is large, sort a stride of it first (e.g. 1%) 
 then choose the pivot as the median point of that stride
A bad case would be, e.g., the array containing almost sorted parts, and the stride getting taken from the part where lowest or highest elements reside.
 c) add introspection to the algorithm, i.e. if an attempt to 
 partition hits the worst case or near worst case, just try 
 another pivot instead of moving forward with the sorting stage.
Now, trying another pivot again and again could give the same O(n^2) in a bad case.
 One simple solution to this (which I forgot to mention) is
 picking a pivot at random.
But it would again mean that the sort function depends on global state (RNG state). And if it doesn't (a local RNG is initialized somehow each time), an adversary would just have to hardcode it once to get the same O(n^2) worst case anytime he wants.

And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:
 I filed a bug report for this issue a year ago:
 http://d.puremagic.com/issues/show_bug.cgi?id=7767

 I've been meaning to fix this issue myself. Time allowing, I'll 
 do it soon.
What I wonder now is: what would be the goal of such a fix?

One possible goal would be to have an O(n log n) worst case sort. And that would perhaps cost some speed and/or memory on average. Besides, such a sort function is already implemented (TimSort), so it's just a matter of setting the default then.

Another goal would be to have O(n^2) only for cases which don't have a reasonable chance to occur in real-world data, but could still be constructed artificially by an adversary. And that could perhaps be done by just changing the getPivot implementation.

Other languages' experience may be useful here:

1. GNU C++ 4.x std::sort implementation is IntroSort [1].

2. Microsoft .NET used Quicksort up to version 4.0 [2]. Now, in .NET 4.5, they use Introsort too [3].

3. For primitive types, Java 6 used a tuned QuickSort [4]. The current Java 7 uses Dual-Pivot QuickSort [5]. Java uses TimSort for Objects [6].

4. Python uses TimSort since version 2.3 [7].

In any case, there are techniques to construct a bad case given a QuickSort implementation. One of them is described by M. Douglas McIlroy in "A Killer Adversary for Quicksort" [8]. We run QuickSort on a would-be-array "a" controlling the "a[i] < a[j]" predicate, and, using only certain assumptions (not looking at the code!), we build a killer-case array on the fly. Given some thought, the technique could perhaps be extended to Dual-Pivot QuickSort too.

As long as some flavor of QuickSort (and topN) is in Phobos, such a unittest would fit the sort implementation nicely, if only to show just how "bad" it can get.

Ivan Kazmenko.
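A hedged D port of the adversary from [8] (the gas/solid terminology follows the paper; this is a sketch, not a vetted attack tool): we sort an index array with a stateful predicate that assigns concrete values lazily, steering a quicksort-style algorithm toward its bad case while staying consistent with every comparison already answered.

```d
import std.algorithm : sort;
import std.array : array;
import std.range : iota;

// Build an n-element "killer" input for whatever comparison sort runs
// inside: unresolved values are "gas" (larger than everything solid);
// when two gas values meet, one of them is frozen ("solidified").
int[] buildKiller(size_t n)
{
    enum gas = int.max;
    auto val = new int[n];
    val[] = gas;
    int nsolid = 0;       // next solid value to hand out
    size_t candidate = 0; // gas index most recently compared

    bool less(size_t x, size_t y)
    {
        if (val[x] == gas && val[y] == gas)
        {
            // freeze one of the two gas values, preferring the candidate
            if (x == candidate) val[x] = nsolid++;
            else val[y] = nsolid++;
        }
        if (val[x] == gas) candidate = x;
        else if (val[y] == gas) candidate = y;
        return val[x] < val[y];
    }

    auto idx = array(iota(n));
    sort!(less)(idx); // the victim sort, driven by the adversary
    return val;       // a concrete input that recreates this run
}

void main()
{
    auto killer = buildKiller(1000);
    assert(killer.length == 1000);
    // Feeding 'killer' back to the same sort recreates the comparison
    // sequence the adversary steered it into.
}
```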
-----
References:
[1] http://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.4/a01027.html#g152148508b4a39e15ffbfbc987ab653a
[2] http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.100%29.aspx
[3] http://msdn.microsoft.com/en-us/library/6tf1f0bc%28v=vs.110%29.aspx
[4] http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#sort%28int[]%29
[5] http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28int[]%29
[6] http://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#sort%28java.lang.Object[]%29
[7] http://en.wikipedia.org/wiki/Timsort
[8] http://www.cs.dartmouth.edu/~doug/mdmspe.pdf
Apr 23 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
24-Apr-2013 01:09, Ivan Kazmenko writes:
 And on Tuesday, 23 April 2013 at 01:10:26 UTC, Xinok wrote:
 I filed a bug report for this issue a year ago:
 http://d.puremagic.com/issues/show_bug.cgi?id=7767

 I've been meaning to fix this issue myself. Time allowing, I'll do it
 soon.
What I wonder now, what would be the goal of such a fix? One possible goal would be to have an O(n log n) worst case sort. And that would perhaps cost some speed and/or memory on average. Besides, such a sort function is already implemented (TimSort), so it's just a matter of setting the default then.
A good unstable sort can do the job faster (at the very least, not slower) than a good stable sort. I'm looking forward to the version of Introsort that Xinok has in mind as a "Q-sort fix".

--
Dmitry Olshansky
Apr 24 2013
prev sibling parent "Xinok" <xinok live.com> writes:
On Friday, 19 April 2013 at 22:37:45 UTC, Ivan Kazmenko wrote:
 So, why isn't TimSort the default?
I would actually argue for this, not for safety (introsort is an adequate solution) but for different reasons. Timsort is stable and it's faster on data with low entropy, being nearly instantaneous on already sorted lists. I would guess it's the better choice for most cases. Then for those cases where stable sorting isn't necessary and unstable sorting would be faster, the user could choose the second option.
Apr 24 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
20-Apr-2013 01:03, Ivan Kazmenko writes:
 With n = 30_000 as in the example, this takes time of the order of a
 second on a modern computer, which is clearly O(n^2).  I am using DMD
 2.062.
Optimization flags, if any?

--
Dmitry Olshansky
Apr 19 2013
parent reply "Ivan Kazmenko" <gassa mail.ru> writes:
 With n = 30_000 as in the example, this takes time of the 
 order of a
 second on a modern computer, which is clearly O(n^2).  I am 
 using DMD
 2.062.
Optimization flags if any?
Both "-O" and no flags give quadratic behavior. Well, if that does not convince you the time grows faster than n log n, maybe this comparison will (dmd -O, Core i7-2600K 3400 MHz):

  n        T, seconds
   15_000   0.312
   30_000   1.076
   60_000   3.775
  120_000  11.247

With n growing x2, the time grows x3 to x4 at each step.
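The measurements above can be reproduced with a loop like this (same Clock-based timing as the original snippet; absolute numbers will of course differ by machine and compiler flags):

```d
import std.algorithm : sort;
import std.array : array;
import std.datetime : Clock;
import std.range : iota;
import std.stdio : writefln;

void main()
{
    foreach (n; [15_000, 30_000, 60_000, 120_000])
    {
        auto a = array(iota(n)) ~ [0]; // sorted run plus one small element
        auto st = Clock.currTime();
        sort(a);
        auto et = Clock.currTime();
        writefln("%8s  %s", n, et - st); // quadratic: time ~quadruples
    }
}
```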
Apr 19 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
20-Apr-2013 02:12, Ivan Kazmenko writes:
 With n = 30_000 as in the example, this takes time of the order of a
 second on a modern computer, which is clearly O(n^2).  I am using DMD
 2.062.
Optimization flags if any?
Both "-O" and no-flags give quadratic behavior.
I was asking about -O -inline -release. In non-release mode it tests that the result is sorted on exit, etc. Anyway, it's 8x as fast with -release -inline for me, but the progression is similarly bad.
 Well, if that does not convince you the time grows faster than n-log-n,
 maybe this comparison shall (dmd -O, Core i7-2600K  3400 MHz):

 n       T, seconds
   15_000   0.312
   30_000   1.076
   60_000   3.775
 120_000  11.247

 With n growing x2, the time grows x3 to x4 at each step.
-- Dmitry Olshansky
Apr 19 2013
prev sibling parent reply "Xinok" <xinok live.com> writes:
On Friday, 19 April 2013 at 21:03:23 UTC, Ivan Kazmenko wrote:
 Hi!

 Consider a sorted array.  Append an element that is less than 
 all the previous elements.  Then sort the array again by the 
 sort function from std.algorithm.

 ....

 With n = 30_000 as in the example, this takes time of the order 
 of a second on a modern computer, which is clearly O(n^2).  I 
 am using DMD 2.062.
I filed a bug report for this issue a year ago:
http://d.puremagic.com/issues/show_bug.cgi?id=7767

I've been meaning to fix this issue myself. Time allowing, I'll do it soon.
Apr 22 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Xinok:

 I've been meaning to fix this issue myself. Time allowing, I'll 
 do it soon.
Good. And if you are willing, please also take a look at the parallel sort problem I've raised:
http://forum.dlang.org/thread/iyunhhsbmurqyouyrhob forum.dlang.org

Bye,
bearophile
Apr 22 2013
parent "Xinok" <xinok live.com> writes:
On Tuesday, 23 April 2013 at 01:28:56 UTC, bearophile wrote:
 Xinok:

 I've been meaning to fix this issue myself. Time allowing, 
 I'll do it soon.
Good. And if you are willing, please also take a look at the parallel sort problem I've raised: http://forum.dlang.org/thread/iyunhhsbmurqyouyrhob forum.dlang.org Bye, bearophile
I could, but I'm not sure what the general consensus is on std.parallel_algorithm as it's been brought up before in a similar context.
Apr 22 2013