digitalmars.D - design question
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Apr 03 2009
- dsimcha <dsimcha yahoo.com> Apr 03 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Apr 03 2009
- Walter Bright <newshound1 digitalmars.com> Apr 03 2009
- Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> Apr 03 2009
- BCS <ao pathlink.com> Apr 03 2009
- Jason House <jason.james.house gmail.com> Apr 03 2009
- "Denis Koroskin" <2korden gmail.com> Apr 03 2009
I'm toying with a rather new approach to a certain problem, and wanted
to ask for some feedback before coding with it.
The STL has two functions with a "stable" variant: stable_sort and
stable_partition. They differ from sort and partition in that they
preserve the order of equivalent objects. They are more costly, hence
the default is "unstable".
I started with the same approach in std.algorithm, but before long I
figured two things:
a) There are many more algorithms that can benefit from stable vs.
unstable versions: remove, topN, schwartzSort, partialSort, completeSort
(take a sorted range and an unsorted one and make a sorted range out of
them), and makeIndex.
b) Aside from stable and unstable, there's a semistable notion: when
partitioning, the left half stays stable and the right half is unstable.
Clearly encoding stability in the function name isn't faring very well,
so I made a policy of it:
enum SwapStrategy { stable, semistable, unstable }
and parameterized the appropriate functions with SwapStrategy. A use
case looks like this:
// even to the left, odd to the right
auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr);
assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]);
assert(p == arr[5 .. $]);
Ok, until now there's only been background :o|. The new part starts here.
I think the stability request could be better encoded as a wrapper of
the range making the request. For example:
auto p = partition!("(a & 1) == 0")(keepStable(arr));
So in order to tell partition you want stability, you just wrap the
range with a call to keepStable. (The runtime cost is negligible).
Similarly, say you want a stable sort. You'd say:
sort(keepStable(arr));
instead of:
sort!(SwapStrategy.stable)(arr);
I have the feeling this approach will scale better in the future. Also
it is less verbose and in sync with the current way of doing searches:
auto f = find(assumeSorted(haystack), needle);
In that case, assumeSorted uses the same means of transmitting
information about a range into an algorithm.
So... I'd appreciate any thoughts you might have. Thanks.
Andrei
Apr 03 2009
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleI think the stability request could be better encoded as a wrapper of the range making the request. For example: auto p = partition!("(a & 1) == 0")(keepStable(arr)); So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say: sort(keepStable(arr)); instead of: sort!(SwapStrategy.stable)(arr);
Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to make modifications to the underlying range stable, or would it just wrap the range in a new type so that you could stick a static if statement in the sort algorithm to decide what to?
Apr 03 2009
dsimcha wrote:== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s articleI think the stability request could be better encoded as a wrapper of the range making the request. For example: auto p = partition!("(a & 1) == 0")(keepStable(arr)); So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say: sort(keepStable(arr)); instead of: sort!(SwapStrategy.stable)(arr);
Can you elaborate a little? In particular, would the KeepStable wrapper actually _do_ anything in itself to make modifications to the underlying range stable, or would it just wrap the range in a new type so that you could stick a static if statement in the sort algorithm to decide what to?
keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it. Andrei
Apr 03 2009
Andrei Alexandrescu wrote:keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it.
Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Apr 03 2009
Walter Bright wrote:Andrei Alexandrescu wrote:keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it.
Would these be composable? I.e.: thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
Great question. Not sure whether I can implement composition to be entirely order-invariant, but yes, the algorithm should "see" various attributes of the wrapped range. Andrei
Apr 03 2009
Reply to Andrei,Walter Bright wrote:Andrei Alexandrescu wrote:keepStable is a template function that simply wraps the range in a different type. It's a way to pass information about the range (e.g., "keep this stable", or "this is sorted already"), to the function understanding it.
thisAttribute(thatAttribute(T)) is the same as thatAttribute(thisAttribute(T)) ? Would the algorithm detect the 'inner' attribute?
entirely order-invariant, but yes, the algorithm should "see" various attributes of the wrapped range. Andrei
you could do composition by tuple/set unioning and always sort the set (by .stringof)
Apr 03 2009
I don't like it. You're introducing types for things that should only exist at the call site. Now I can declare variables with those types, and even start composing them together. If a user zips two stable ranges together, they should be pissed if they don't get a stable range out. Even if std.algorithm gets it right, 3rd party libs might not. Additionally, std.algorithm should support others mimicing the same type-based design style. Overall, i don't think it makes implementing sort easier, but it opens a lot of undesirable issues. Andrei Alexandrescu Wrote:I'm toying with a rather new approach to a certain problem, and wanted to ask for some feedback before coding with it. The STL has two functions with a "stable" variant: stable_sort and stable_partition. They differ from sort and partition in that they preserve the order of equivalent objects. They are more costly, hence the default is "unstable". I started with the same approach in std.algorithm, but before long I figured two things: a) There are many more algorithms that can benefit from stable vs. unstable versions: remove, topN, schwartzSort, partialSort, completeSort (take a sorted range and an unsorted one and make a sorted range out of them), and makeIndex. b) Aside from stable and unstable, there's a semistable notion: when partitioning, the left half stays stable and the right half is unstable. Clearly encoding stability in the function name isn't faring very well, so I made a policy of it: enum SwapStrategy { stable, semistable, unstable } and parameterized the appropriate functions with SwapStrategy. A use case looks like this: // even to the left, odd to the right auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr); assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]); assert(p == arr[5 .. $]); Ok, until now there's only been background :o|. The new part starts here. I think the stability request could be better encoded as a wrapper of the range making the request. For example: auto p = partition!("(a & 1) == 0")(keepStable(arr)); So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say: sort(keepStable(arr)); instead of: sort!(SwapStrategy.stable)(arr); I have the feeling this approach will scale better in the future. Also it is less verbose and in sync with the current way of doing searches: auto f = find(assumeSorted(haystack), needle); In that case, assumeSorted uses the same means of transmitting information about a range into an algorithm. So... I'd appreciate any thoughts you might have. Thanks. Andrei
Apr 03 2009
On Fri, 03 Apr 2009 20:53:18 +0400, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:I'm toying with a rather new approach to a certain problem, and wanted to ask for some feedback before coding with it. The STL has two functions with a "stable" variant: stable_sort and stable_partition. They differ from sort and partition in that they preserve the order of equivalent objects. They are more costly, hence the default is "unstable". I started with the same approach in std.algorithm, but before long I figured two things: a) There are many more algorithms that can benefit from stable vs. unstable versions: remove, topN, schwartzSort, partialSort, completeSort (take a sorted range and an unsorted one and make a sorted range out of them), and makeIndex. b) Aside from stable and unstable, there's a semistable notion: when partitioning, the left half stays stable and the right half is unstable. Clearly encoding stability in the function name isn't faring very well, so I made a policy of it: enum SwapStrategy { stable, semistable, unstable } and parameterized the appropriate functions with SwapStrategy. A use case looks like this: // even to the left, odd to the right auto arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; auto p = partition!("(a & 1) == 0", SwapStrategy.stable)(arr); assert(arr == [2, 4, 6, 8, 10, 1, 3, 5, 7, 9]); assert(p == arr[5 .. $]); Ok, until now there's only been background :o|. The new part starts here. I think the stability request could be better encoded as a wrapper of the range making the request. For example: auto p = partition!("(a & 1) == 0")(keepStable(arr)); So in order to tell partition you want stability, you just wrap the range with a call to keepStable. (The runtime cost is negligible). Similarly, say you want a stable sort. You'd say: sort(keepStable(arr)); instead of: sort!(SwapStrategy.stable)(arr); I have the feeling this approach will scale better in the future. Also it is less verbose and in sync with the current way of doing searches: auto f = find(assumeSorted(haystack), needle); In that case, assumeSorted uses the same means of transmitting information about a range into an algorithm. So... I'd appreciate any thoughts you might have. Thanks. Andrei
This sounds interesting, but I don't thing it's good idea overall. assumeSorted(range) is fine, because "sorted" is a property of a range. auto r = assumeSorted(range); // makes sense keepStable(range) is not okay, because stable is not a property of a range. It's a property of an algorithm. auto r = keepStable(range); // makes no sense to me
Apr 03 2009









BCS <ao pathlink.com> 