digitalmars.D - How to override opCmp when two different sorts are needed?

John Bartelt (3/3) Aug 13 2007 I have a class that contains two variables. Sometimes I need to sort an ...

Jarrett Billingsley (6/14) Aug 14 2007 Write your own sort routine which takes a given predicate. :\

Bill Baxter (8/26) Aug 14 2007 There's also Oskar's sort w/predicate
C. Dunn (2/7) Aug 14 2007 It's not just a matter of elegance. C++ std::sort() is way, way faster ...

janderson (11/19) Aug 14 2007 In my experience/benchmarks (MS VC++ additions of C++ 2002/2003/2005)
Regan Heath (4/15) Aug 15 2007 I believe the intention for D is that the optimizer will automagically

Deewiant (35/43) Aug 15 2007 I'd imagine it's quite hard to inline a delegate or function pointer. In...

Regan Heath (10/57) Aug 15 2007 True about delegates/fpointers I reckon.

Deewiant (70/86) Aug 15 2007 It's not _bad_ per se. I understand the problem is difficult and that Wa...

Frits van Bommel (3/11) Aug 15 2007 The third one is actually equivalent to array[x..y] and not to the other...

Deewiant (5/17) Aug 15 2007 Good catch, you're right. It should of course be array[0..x+y][x..$].

Regan Heath (3/19) Aug 15 2007 I prefer the first form, I find the rest confusing.

BCS (13/24) Aug 14 2007 you could try some dirty hacks:

Frits van Bommel (3/5) Aug 14 2007 That probably looks better as "(cast(FooWrap[]) f).sort", which should

BCS (4/12) Aug 14 2007 yes but you might be able to slip that past in a code review (if coffee ...

John Bartelt <bartelt cox.net> writes:

I have a class that contains two variables. Sometimes I need to sort an array
of the class based on variable 1, and later in the same progam sort the array
based on variable 2.

Can someone give me an idea of how to do this (smartly, efficiently, cleverly,
polymorphically, ...)?

Best idea that I've had so far is to use a static flag in the class to indicate
which sort is desired, and then to have the overloaded opCmp() check the flag
before comparing.

Aug 13 2007

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"John Bartelt" <bartelt cox.net> wrote in message 
news:f9quel$rou$1 digitalmars.com...
I have a class that contains two variables. Sometimes I need to sort an 
array of the class based on variable 1, and later in the same progam sort 
the array based on variable 2.

 Can someone give me an idea of how to do this (smartly, efficiently, 
 cleverly, polymorphically, ...)?

 Best idea that I've had so far is to use a static flag in the class to 
 indicate which sort is desired, and then to have the overloaded opCmp() 
 check the flag before comparing.

Write your own sort routine which takes a given predicate.  :\

Tango provides such a routine in tango.core.Array; phobos provides no such 
analogue.  With phobos, either write your own routine, or use 
std.c.stdlib.qsort.  I'll let you decide which is more elegant.

Aug 14 2007

Bill Baxter <dnewsgroup billbaxter.com> writes:

Jarrett Billingsley wrote:
 "John Bartelt" <bartelt cox.net> wrote in message 
 news:f9quel$rou$1 digitalmars.com...
 I have a class that contains two variables. Sometimes I need to sort an 
 array of the class based on variable 1, and later in the same progam sort 
 the array based on variable 2.

 Can someone give me an idea of how to do this (smartly, efficiently, 
 cleverly, polymorphically, ...)?

 Best idea that I've had so far is to use a static flag in the class to 
 indicate which sort is desired, and then to have the overloaded opCmp() 
 check the flag before comparing.

 
 Write your own sort routine which takes a given predicate.  :\
 
 Tango provides such a routine in tango.core.Array; phobos provides no such 
 analogue.  With phobos, either write your own routine, or use 
 std.c.stdlib.qsort.  I'll let you decide which is more elegant. 

There's also Oskar's sort w/predicate
     http://www.csc.kth.se/~ol/array.d

I thought the question was more about how to organize the software to 
control the change in behavior though.  For that I think you'll have to 
think about who "owns" the change in behavior and to whom the change 
should be visible to.

--bb

Aug 14 2007

C. Dunn <cdunn2001 gmail.com> writes:

Jarrett Billingsley Wrote:
 Write your own sort routine which takes a given predicate.  :\
 
 Tango provides such a routine in tango.core.Array; phobos provides no such 
 analogue.  With phobos, either write your own routine, or use 
 std.c.stdlib.qsort.  I'll let you decide which is more elegant. 

It's not just a matter of elegance.  C++ std::sort() is way, way faster than
qsort in cstdlib because the comparison function gets inlined.  Sorting is one
place where inlining is critical.

Aug 14 2007

janderson <askme me.com> writes:

C. Dunn wrote:
 Jarrett Billingsley Wrote:
 Write your own sort routine which takes a given predicate.  :\

 Tango provides such a routine in tango.core.Array; phobos provides no such 
 analogue.  With phobos, either write your own routine, or use 
 std.c.stdlib.qsort.  I'll let you decide which is more elegant. 

 
 It's not just a matter of elegance.  C++ std::sort() is way, way faster than
qsort in cstdlib because the comparison function gets inlined.  Sorting is one
place where inlining is critical.

In my experience/benchmarks (MS VC++ additions of C++ 2002/2003/2005) 
I've found that qsort is marginally faster at sorting then std::sort. 
Now it probably the type of data I was sorting however I don't support 
the way way faster argument.

Note the data I was sorting was pointers, 10-200 elements many times a 
frame.  Also note that std::sort will call the slow copy constructor if 
you use direct objects while qsort can't.

Also MS std::sort is much slower in debug due to all the extra checks 
they put into iterators.

-Joel

Aug 14 2007

Regan Heath <regan netmail.co.nz> writes:

C. Dunn wrote:
 Jarrett Billingsley Wrote:
 Write your own sort routine which takes a given predicate.  :\
 
 Tango provides such a routine in tango.core.Array; phobos provides
 no such analogue.  With phobos, either write your own routine, or
 use std.c.stdlib.qsort.  I'll let you decide which is more elegant.
 

 
 It's not just a matter of elegance.  C++ std::sort() is way, way
 faster than qsort in cstdlib because the comparison function gets
 inlined.  Sorting is one place where inlining is critical.

I believe the intention for D is that the optimizer will automagically 
inline everything it can.

Regan

Aug 15 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Regan Heath wrote:
 C. Dunn wrote:
 It's not just a matter of elegance.  C++ std::sort() is way, way
 faster than qsort in cstdlib because the comparison function gets
 inlined.  Sorting is one place where inlining is critical.

 
 I believe the intention for D is that the optimizer will automagically
 inline everything it can.
 

I'd imagine it's quite hard to inline a delegate or function pointer. In any
case, the current optimizer is a joke in many ways.

Functions, on the other hand, can be inlined, which is why std::sort and D's
built-in sort beat C's qsort (assuming they use approximately the same
algorithm): they're hard-coded to use "a < b", which is either a "cmp" x86
instruction or an inlineable call to a.opCmp(b) which usually translates to "int
< int" and thus a "cmp" anyway.

This is why I recently suggested
(http://www.dsource.org/projects/tango/ticket/571) that Tango provide for
something like:

enum Dir : bool { Ascending, Descending }

// I forget the correct syntax, sorry if this is wrong
template sort(T, Dir D = Dir.Ascending) {
	bool compare(T a, T b) {
		static if (D == Dir.Ascending)
			return a < b;
		else
			return a > b;
	}
	
	void sort(T[] a) {
		// sort using compare for comparing
	}
}

This allows the user to choose whether to sort in ascending or descending order,
and improves speed noticeably. Currently Tango uses a functor struct:

struct IsLess(T) {
	bool opCall(T a, T b) {
		return a < b;
	}
}

Which is not inlined. A "call" instruction is generated for every comparison.

-- 
Remove ".doesnotlike.spam" from the mail address.

Aug 15 2007

Regan Heath <regan netmail.co.nz> writes:

Deewiant wrote:
 Regan Heath wrote:
 C. Dunn wrote:
 It's not just a matter of elegance.  C++ std::sort() is way, way
 faster than qsort in cstdlib because the comparison function gets
 inlined.  Sorting is one place where inlining is critical.

 I believe the intention for D is that the optimizer will automagically
 inline everything it can.

 
 I'd imagine it's quite hard to inline a delegate or function pointer. In any
 case, the current optimizer is a joke in many ways.

True about delegates/fpointers I reckon.

I don't know why people keep hating on the optimizer.  Isn't it the same 
optimizer used in DMC, a C compiler with 25 odd years of history?  Sure, 
it's not 'optimized' for D specifically but it still holds it's own in 
shootout results in the general sense, right?

 Functions, on the other hand, can be inlined, which is why std::sort and D's
 built-in sort beat C's qsort (assuming they use approximately the same
 algorithm): they're hard-coded to use "a < b", which is either a "cmp" x86
 instruction or an inlineable call to a.opCmp(b) which usually translates to
"int
 < int" and thus a "cmp" anyway.
 
 This is why I recently suggested
 (http://www.dsource.org/projects/tango/ticket/571) that Tango provide for
 something like:
 
 enum Dir : bool { Ascending, Descending }
 
 // I forget the correct syntax, sorry if this is wrong
 template sort(T, Dir D = Dir.Ascending) {
 	bool compare(T a, T b) {
 		static if (D == Dir.Ascending)
 			return a < b;
 		else
 			return a > b;
 	}
 	
 	void sort(T[] a) {
 		// sort using compare for comparing
 	}
 }
 
 This allows the user to choose whether to sort in ascending or descending
order,
 and improves speed noticeably. Currently Tango uses a functor struct:
 
 struct IsLess(T) {
 	bool opCall(T a, T b) {
 		return a < b;
 	}
 }
 
 Which is not inlined. A "call" instruction is generated for every comparison.

Instead of limiting it to opCall why not define an Interface 'Sortable' 
and define a compare method which takes a mode flag and "Sortable rhs" 
or similar.

Regan

Aug 15 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Regan Heath wrote:
 Deewiant wrote:
 I'd imagine it's quite hard to inline a delegate or function pointer.
 In any
 case, the current optimizer is a joke in many ways.

 
 True about delegates/fpointers I reckon.
 
 I don't know why people keep hating on the optimizer.  Isn't it the same
 optimizer used in DMC, a C compiler with 25 odd years of history?  Sure,
 it's not 'optimized' for D specifically but it still holds it's own in
 shootout results in the general sense, right?
 

It's not _bad_ per se. I understand the problem is difficult and that Walter's
doing it on his own, but I can't help but drool over the thought of an "Intel D
Compiler".

The main problems with the Digital Mars optimizer, IMO, are as follows.

(Disclaimer: I understand that languages have to cater for many different
environments and some of the following may only speed things up on my computer,
while slowing things down on others. Still, there are ample resources on the
'Net regarding the performance of practically all parts of various processors.)

- Almost no D support, as you say. For instance, slices:

array[x .. x + y]
array[x..$][0..y]
array[0..y][x..$]
(&array[x])[0..y]

The above forms are equivalent, yet generate very different code. I feel that
the middle two are clearer expressions of "take y elements starting from x" but
they result in very slow code: it's actually the last one, using a pointer,
that's the fastest.

- Can't inline functions containing assembly language. This is very annoying: if
you have a fast hand-coded micro-optimization you have to manually inline the
function everywhere you use it. Otherwise, you tend to end up creating much
slower code because the function isn't inlined.

- No support for instruction sets beyond the basic x86. (MMX, SSE, 3DNow!, etc.)

- In general, can't (or only rarely can) do stuff like loop unrolling,
elimination of branching/jumps, and "loop invariant code motion" which is the
fancy term for moving something like "x * y * z + a / b - c" from a loop which
just sets all array elements to that value.

It's really annoying to find that writing something as simple as "int tmp = x*2"
prior to a loop and using tmp instead of x*2 can sometimes result in noticeable
speedups.


Some things which might just be me expecting too much, I don't know how good
other optimizers are at this:


- Poor execution path analysis. For instance:

for (size_t i = 5; i < maximum_i; ++i) {
	
	for (size_t j = i; j >= 5; --j)
		// do stuff
}

Here, the inner loop tests for j >= 5. Since j = i, and i goes from 5 up to
maximum_i, the test will always be true. Hence the test could be eliminated for
the first iteration, making the inner loop equivalent to:

size_t j = i;
do {
	// stuff
} while (j-- >= 5);

Another example: assume the loop's contents are simply:

array[j-1] = array[j];

Now, two subtractions are performed. First to get j-1, and then to decrement j
in the while test. The value could be cached, which may speed things up.

(Manually optimizing code combining the above two examples lead to a measurable
speedup on a sort algorithm I was benchmarking yesterday.)

- Doesn't do algebra very well (includes integer, floating point, boolean, and
bit vector algebra). May also be related to the above. For instance:

int x = -y;
int a = x;
int b = a + y;

Here, b is set to "a + y", equivalent to "x + y", equivalent to "-y + y",
equivalent to 0. This kind of stuff has to be caught manually.

Optimizing_cpp.pdf over at http://www.agner.org/optimize/ compares the
optimizations applied by C++ compilers in section 7.2, and it includes DMC. It's
a bit unfair in that the compilers are of different ages, but I don't think
there's been that much progress in any of the compilers anyway.

 Instead of limiting it to opCall why not define an Interface 'Sortable'
 and define a compare method which takes a mode flag and "Sortable rhs"
 or similar.
 

If you just want to sort basic types with varying orderings, wrapping them in a
class is a waste of memory and CPU.

Currently, the routine takes any callable type: something with opCall, or a
function pointer or delegate. The idea is to allow the option of
ascending/descending because those are the two most common cases and result in
major reduction of execution time.

-- 
Remove ".doesnotlike.spam" from the mail address.

Aug 15 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

Deewiant wrote:
 - Almost no D support, as you say. For instance, slices:
 
 array[x .. x + y]
 array[x..$][0..y]
 array[0..y][x..$]
 (&array[x])[0..y]
 
 The above forms are equivalent, yet generate very different code. I feel that

The third one is actually equivalent to array[x..y] and not to the other 
three.

Aug 15 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Frits van Bommel wrote:
 Deewiant wrote:
 - Almost no D support, as you say. For instance, slices:

 array[x .. x + y]
 array[x..$][0..y]
 array[0..y][x..$]
 (&array[x])[0..y]

 The above forms are equivalent, yet generate very different code.

 
 The third one is actually equivalent to array[x..y] and not to the other
 three.

Good catch, you're right. It should of course be array[0..x+y][x..$].

I prefer the second form, anyway.

-- 
Remove ".doesnotlike.spam" from the mail address.

Aug 15 2007

Regan Heath <regan netmail.co.nz> writes:

Deewiant wrote:
 Frits van Bommel wrote:
 Deewiant wrote:
 - Almost no D support, as you say. For instance, slices:

 array[x .. x + y]
 array[x..$][0..y]
 array[0..y][x..$]
 (&array[x])[0..y]

 The above forms are equivalent, yet generate very different code.

 The third one is actually equivalent to array[x..y] and not to the other
 three.

 
 Good catch, you're right. It should of course be array[0..x+y][x..$].
 
 I prefer the second form, anyway.

I prefer the first form, I find the rest confusing.

Regan

Aug 15 2007

BCS <ao pathlink.com> writes:

Reply to John,

 I have a class that contains two variables. Sometimes I need to sort
 an array of the class based on variable 1, and later in the same
 progam sort the array based on variable 2.
 
 Can someone give me an idea of how to do this (smartly, efficiently,
 cleverly, polymorphically, ...)?
 
 Best idea that I've had so far is to use a static flag in the class to
 indicate which sort is desired, and then to have the overloaded
 opCmp() check the flag before comparing.
 

you could try some dirty hacks:

// what you want to sort
class Foo { int myCmp(Object o){...} }

// struct of same size with my opCmp
struct FooWrap 
{
Foo f;
int opCmp(Object o){return f.myCmp(o);}
}


Foo[] f = ...;

// overlay array with new struct array and sort that
(cast(FooWrap*)f.ptr)[0..f.length].sort

Aug 14 2007

Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:

BCS wrote:
 // overlay array with new struct array and sort that
 (cast(FooWrap*)f.ptr)[0..f.length].sort

That probably looks better as "(cast(FooWrap[]) f).sort", which should 
be equivalent.

Aug 14 2007

BCS <ao pathlink.com> writes:

Reply to Frits,

 BCS wrote:
 
 // overlay array with new struct array and sort that
 (cast(FooWrap*)f.ptr)[0..f.length].sort
 

 That probably looks better as "(cast(FooWrap[]) f).sort", which should
 be equivalent.
 

yes but you might be able to slip that past in a code review (if coffee is 
in short supply). My version will get you tied to a post and shot (as you 
is proper) even if everyone else in the room is asleep!

Aug 14 2007

D Programming

C/C++ Programming

Other

digitalmars.D - How to override opCmp when two different sorts are needed?