www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - C++'s std::rotate

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


In the discussion of iterators vs. ranges in D, ranges "won" in the 
sense that there hasn't been strong evidence of a need for iterators to 
complement ranges; people have merrily used std.algorithm and friends 
and all was good.

However, it has been long acknowledged that iterators are sometimes more 
expressive/economical than ranges, in the same way pointers are more so 
than slices: iterators are a lower-level building block so they can be 
used to build ranges and also other things. There's an inclusion 
relationship between what can be done with iterators and what can be 
done with ranges.

Amid these observations, C++'s std::rotate (http://goo.gl/z8FjV) and 
derivative algorithms have been a prime category of examples. C++'s 
std::rotate takes a "three-legged range", i.e. three iterators begin, 
middle, and end, and rotates the range [middle, end) to precede [begin, 
middle). It returns (as of C++11) the new middle. For example, given the 
range [0, 1, 2, 3, 4, 5] with begin -> 0, mid -> 2, and end -> just past 
5, rotate rearranges elements as [2, 3, 4, 5, 0, 1, 2] and returns a 
pointer to the new position of 0.

This is a very challenging algorithm for ranges because it's difficult 
to express both the input and the output appropriately. My first insight 
into the matter has been that the ranges [begin, middle) and [middle, 
end) don't even need to be adjacent, so I defined bringToFront 
(http://goo.gl/5HUCiV) as a slightly different take on std::rotate. It 
rotates any two ranges, adjacent or not, and returns the number of 
elements in the right-hand range.

That's in a way more general than std::rotate but at the same time less 
satisfactory because it returns only a number, giving no access to the 
new positions of the two elements.

Following an exchange with Eric Niebler I've scored one more 
mini-breakthrough today: a variant of std::rotate that rivals C++'s one, 
without needing iterators. The only abstraction needed is 
r1.sameFront(r2), which returns true iff the forward ranges r1 and r2 
have fronts pointing to the same element. The function can be 
implemented generically for ranges that offer front as a reference:

bool sameFront(R1, R2)(R1 r1, R2 r2) {
   return &r1.front == &r2.front;
}

Ranges that offer rvalues from front (proxying etc) must implement 
sameFront as a primitive.

Implementation is at http://dpaste.dzfl.pl/a0effbaee0a9. For historical 
reasons I've reused an undocumented function sameHead. It has the 
meaning of sameFront above. Signature is:

Tuple!(typeof(whole.takeExactly(0)), R)
rotateTail(R)(R whole, R right);

The algorithm assumes that "right" is a subrange of "whole" sitting at 
its tail, i.e. calling while.popFront a number of times will make 
whole.sameFront(right) true. It moves right to the front of whole and 
returns (here's the beauty of it) the new positions of the two subranges.

For example, say whole = [0, 1, 2, 3, 4, 5] and right = whole[4 .. $]. 
Then rotateTail(whole, right) makes whole = [4, 5, 0, 1, 2, 3] and 
returns tuple(whole[0 .. 2], whole[2 .. $]).

An essential role in this all is takeExactly (just like take, but knows 
the length in O(1)). It's been an essential component in a few 
range-based algorithms and it seems an important abstraction that closes 
the circle on rotateTail as well, almost too fittingly. Eric mentioned 
that he independently saw a need for it and defined what he calls 
SizedIteratorRange.


Andrei
Aug 10 2014
next sibling parent reply "Dragos Carp" <dragoscarp gmail.com> writes:
Nice work!

 Implementation is at http://dpaste.dzfl.pl/a0effbaee0a9. For 
 historical reasons I've reused an undocumented function 
 sameHead.
sameHead is documented. I already use it a couple of times.
 The algorithm assumes that "right" is a subrange of "whole" 
 sitting at its tail, ...
sameTail is also in the Phobos docs. overlap is not documented. A couple of times writing contracts, I missed something like: bool sliceOf(T)(in T[] whole, in T[] slice) { return whole.ptr <= slice.ptr && whole.ptr + slice.length <= whole.ptr + slice.length; } The alternative using overlap: overlap(whole, slice) is slice is a little bit too expensive for my use case.
Aug 10 2014
next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 11 August 2014 at 06:56:52 UTC, Dragos Carp wrote:
 bool sliceOf(T)(in T[] whole, in T[] slice)
 {
     return whole.ptr <= slice.ptr &&
         whole.ptr + slice.length <= whole.ptr + slice.length;
 }
Shouldn't the function arguments of sliceOf be reversed to given a more intuitive UCFS as if (slice.sliceOf(whole) { ... } ?
Aug 11 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/11/14, 2:11 AM, "Nordlöw" wrote:
 On Monday, 11 August 2014 at 06:56:52 UTC, Dragos Carp wrote:
 bool sliceOf(T)(in T[] whole, in T[] slice)
 {
     return whole.ptr <= slice.ptr &&
         whole.ptr + slice.length <= whole.ptr + slice.length;
 }
Shouldn't the function arguments of sliceOf be reversed to given a more intuitive UCFS as if (slice.sliceOf(whole) { ... }
isSliceOf -> yum
Aug 11 2014
next sibling parent "Dragos Carp" <dragoscarp gmail.com> writes:
 isSliceOf -> yum
PR created: https://github.com/D-Programming-Language/phobos/pull/2416
Aug 11 2014
prev sibling next sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 11 August 2014 at 14:45:09 UTC, Andrei Alexandrescu 
wrote:
isSliceOf -> yum

Can you elaborate on that?
Aug 11 2014
parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 11 August 2014 at 18:11:19 UTC, Nordlöw wrote:
 On Monday, 11 August 2014 at 14:45:09 UTC, Andrei Alexandrescu 
 wrote:
 isSliceOf -> yum

 Can you elaborate on that?
I get it, Andrei :)
Aug 11 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/11/14, 11:12 AM, "Nordlöw" wrote:
 On Monday, 11 August 2014 at 18:11:19 UTC, Nordlöw wrote:
 On Monday, 11 August 2014 at 14:45:09 UTC, Andrei Alexandrescu wrote:
 isSliceOf -> yum

 Can you elaborate on that?
I get it, Andrei :)
Yah, all about making "a.isSliceOf(b)" a nice phrase and keeping the tradition of other isXxx names in Phobos. -- Andrei
Aug 11 2014
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Monday, 11 August 2014 at 14:45:09 UTC, Andrei Alexandrescu 
wrote:
 On 8/11/14, 2:11 AM, "Nordlöw" wrote:
 On Monday, 11 August 2014 at 06:56:52 UTC, Dragos Carp wrote:
 bool sliceOf(T)(in T[] whole, in T[] slice)
 {
    return whole.ptr <= slice.ptr &&
        whole.ptr + slice.length <= whole.ptr + slice.length;
 }
Shouldn't the function arguments of sliceOf be reversed to given a more intuitive UCFS as if (slice.sliceOf(whole) { ... }
isSliceOf -> yum
While "sameHead" and "sameTail" *could* have a "good enough" generic implementation for ranges, there is absolutely no way to make "isSliceOf" or "overlap" work for a generic range. That said, sameHead and sameTail is just the iterator equivalent of "first1 == first2" and "last1 == last2", which is used a lot with iterators. You rarely see operator "<" used with iterators though, so I have doubts about why those two functions (isSliceOf and overlap) would actually be of any use.
Aug 22 2014
prev sibling parent reply =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 11 August 2014 at 06:56:52 UTC, Dragos Carp wrote:
 bool sliceOf(T)(in T[] whole, in T[] slice)
 {
     return whole.ptr <= slice.ptr &&
         whole.ptr + slice.length <= whole.ptr + slice.length;
 }
Correction: This is what I think you mean: bool sliceOf(T)(in T[] part, in T[] whole) { return (whole.ptr <= part.ptr && part.ptr + part.length <= whole.ptr + whole.length); }
Aug 11 2014
parent reply "Dragos Carp" <dragoscarp gmail.com> writes:
On Monday, 11 August 2014 at 10:09:53 UTC, Nordlöw wrote:
 On Monday, 11 August 2014 at 06:56:52 UTC, Dragos Carp wrote:
 bool sliceOf(T)(in T[] whole, in T[] slice)
 {
    return whole.ptr <= slice.ptr &&
        whole.ptr + slice.length <= whole.ptr + slice.length;
 }
Correction: This is what I think you mean: bool sliceOf(T)(in T[] part, in T[] whole) { return (whole.ptr <= part.ptr && part.ptr + part.length <= whole.ptr + whole.length); }
Yes, of course. I had lhs, rhs and messed up the renaming of those.
Aug 11 2014
parent reply "Dragos Carp" <dragoscarp gmail.com> writes:
 Correction: This is what I think you mean:

 bool sliceOf(T)(in T[] part,
                in T[] whole)
 {
    return (whole.ptr <= part.ptr &&
            part.ptr + part.length <=
            whole.ptr + whole.length);
 }
Yes, of course. I had lhs, rhs and messed up the renaming of those.
https://github.com/dcarp/phobos/compare/sliceOf
Aug 11 2014
parent =?UTF-8?B?Ik5vcmRsw7Z3Ig==?= <per.nordlow gmail.com> writes:
On Monday, 11 August 2014 at 11:00:41 UTC, Dragos Carp wrote:
 https://github.com/dcarp/phobos/compare/sliceOf
Why not use something like "part" and "whole" instead of "lhs" and "rhs"? It is more self-documenting.
Aug 11 2014
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
This reminds me, we still need "allBefore" to implement 
nextPermutation correctly for bidirectional ranges.

https://issues.dlang.org/show_bug.cgi?id=12188

I think this would help here also.
Aug 11 2014
prev sibling next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
On 8/11/14, 12:29 AM, Andrei Alexandrescu wrote:
 Hello,
In which algorithms would one use std::rotate?
Aug 11 2014
next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 11 August 2014 at 13:55:07 UTC, Ary Borenszweig wrote:
 On 8/11/14, 12:29 AM, Andrei Alexandrescu wrote:
 Hello,
In which algorithms would one use std::rotate?
http://en.wikipedia.org/wiki/Block_sort
Aug 11 2014
prev sibling next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 11 August 2014 at 13:55:07 UTC, Ary Borenszweig wrote:
 On 8/11/14, 12:29 AM, Andrei Alexandrescu wrote:
 Hello,
In which algorithms would one use std::rotate?
Pushing N items to the front of a vector is implemented as pushing N to the back then rotating them to the front.
Aug 11 2014
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/11/14, 6:55 AM, Ary Borenszweig wrote:
 On 8/11/14, 12:29 AM, Andrei Alexandrescu wrote:
 Hello,
In which algorithms would one use std::rotate?
Depends on whom you ask :o). I think it's a fairly obscure algorithm, better suited as representative of a class rather than frequently useful. Facebook's C++ code base only has few uses, all in the same pattern rotate(b, b + 1, e) i.e. bubble the front all the way to the back with random iterators. (The result is not used and is trivially e - 1.) (One frequent use is in ranges/iterators debates; out of the woodwork come people whose livelihood apparently depends on rotate, in particular on using rotate with non-random iterators (random access ranges make it easy to implement) AND needing the result.) I do think that, like partition or partialSort, std::rotate is one of those algorithms that's good to know about because it may save some otherwise awkward solutions. Sean Parent has a nice talk http://channel9.msdn.com/Events/GoingNative/2013/Cpp-Seasoning on such. For my money I think http://dpaste.dzfl.pl/a0effbaee0a9 is fine (after being of course productionized to take advantage of random access if available, improve on the recursion, etc). Requiring sameHead or random access seems to me a good sweet spot. Andrei
Aug 11 2014
parent reply "Fool" <fool dlang.org> writes:
On Monday, 11 August 2014 at 15:04:52 UTC, Andrei Alexandrescu 
wrote:
 http://dpaste.dzfl.pl/a0effbaee0a9
Is your design final with respect to handling degenerate cases? What do you think about http://dpaste.dzfl.pl/8fc83cb06de8 ?
Aug 17 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/17/14, 9:06 AM, Fool wrote:
 On Monday, 11 August 2014 at 15:04:52 UTC, Andrei Alexandrescu wrote:
 http://dpaste.dzfl.pl/a0effbaee0a9
Is your design final with respect to handling degenerate cases? What do you think about http://dpaste.dzfl.pl/8fc83cb06de8 ?
Yah, it's a good option. Relevant code: if (right is whole) { //writeln("case identical"); return tuple(right, whole.dropExactly(whole.length)); } (Prolly it would be better to use "whole.sameHead(right)" instead of "right is whole". Also whole may not define length so the actual algo is just a tad different.) Trouble is it takes O(n) for that trivial case and for what may be in all likelihood a useless return (iterate right all the way through the end and return the empty balance). On the bright side, client code does have the option to make the test before calling rotateTail so in a way the function is more consistent/complete while still leaving the caller the option to avoid the corner case. Not sure what's the best call here yet. Loosely related, Xinok wrote a nice piece on in-place merge: http://goo.gl/AS2P4A. A great use of rotateTail. Andrei
Aug 17 2014
parent reply "Fool" <fool dlang.org> writes:
On Monday, 18 August 2014 at 05:27:11 UTC, Andrei Alexandrescu 
wrote:
 On 8/17/14, 9:06 AM, Fool wrote:
  http://dpaste.dzfl.pl/8fc83cb06de8
Yah, it's a good option. Relevant code: if (right is whole) { //writeln("case identical"); return tuple(right, whole.dropExactly(whole.length)); } (Prolly it would be better to use "whole.sameHead(right)" instead of "right is whole". Also whole may not define length so the actual algo is just a tad different.)
A cleaned-up version is here: http://dpaste.dzfl.pl/a8c36a0c36d0
 Trouble is it takes O(n) for that trivial case and for what may 
 be in all likelihood a useless return (iterate right all the 
 way through the end and return the empty balance).
Yah, on the other hand you can only expect to get what you pay for. The strategy that naturally arises from the basic algorithm is even worse. In the case at hand, it suggests executing n trivial swaps.
 Loosely related, Xinok wrote a nice piece on in-place merge: 
 http://goo.gl/AS2P4A. A great use of rotateTail.
Thanks for that link. It's a nice writeup. -- Fool
Aug 19 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/19/14, 12:32 PM, Fool wrote:
 On Monday, 18 August 2014 at 05:27:11 UTC, Andrei Alexandrescu wrote:
 On 8/17/14, 9:06 AM, Fool wrote:
  http://dpaste.dzfl.pl/8fc83cb06de8
Yah, it's a good option. Relevant code: if (right is whole) { //writeln("case identical"); return tuple(right, whole.dropExactly(whole.length)); } (Prolly it would be better to use "whole.sameHead(right)" instead of "right is whole". Also whole may not define length so the actual algo is just a tad different.)
A cleaned-up version is here: http://dpaste.dzfl.pl/a8c36a0c36d0
FWIW just added https://issues.dlang.org/show_bug.cgi?id=13335 Andrei
Aug 19 2014
parent reply "Fool" <fool dlang.org> writes:
After having some fun with rotateTail and friends a draft 
implementation [2] is now available and ready for comments aiming 
at formal specification, algorithmics, and implementation details.

While most ideas were taken from [1], all mistakes are mine. 
Since this is my first D code, more or less, please take it with 
a grain of salt and don't hesitate to express the obvious.

Summary

The original recursive algorithm applicable to ForwardRange 
experienced another pass [B]. It was put out of contention by an 
iterative transcription [D] which turned out to be notably faster.

For bidirectional iterators, there is a beautiful reduction to 
three calls of the reverse function. It was not hard to come up 
with a corresponding version for BidirectionalRange [E].

Finally, I looked at a third approach which prefers moving to 
swapping. My implementation [C], however, does not seem to be 
competitive in practice.

Compared to their iterator analogues, the class of range based 
rotation algorithms considered comprise a certain amount of 
additional cost. Practice will show whether it is relevant. On 
the other hand it seems to me that this cost could be minimized 
if ranges were based on iterators.

At this point, I refrain from posting results of my biased 
performance tests. It is safe to assume that the implementation 
can be tuned based on profiling information.

All algorithms preliminarily handle degenerate cases as proposed 
by myself. I believe now that rotateTail(r, r) should return 
r.popFrontAll [A] if and only if it is not more expensive than 
returning some empty range. Is there a way to check whether 
r.popFrontExactly(r.walkLength) is O(1)?

Fool

--

[1] A. Stepanov: Rotate - Lecture 19. in Notes on Programming, 
2007, pp.154--167
     URL http://www.stepanovpapers.com/notes.pdf

[2] http://dpaste.dzfl.pl/514a1ef77d0a

[A] popFrontAll, ll.47ff
[B] rotateTailRecursive, ll.75ff
[C] rotateTailCycle, ll.147ff
[D] rotateTail, ll.227ff, and rotateTailAux, ll.325ff
[E] rotateTail, ll.227ff, and rotateTailAux, ll.407ff
Aug 23 2014
parent reply "Fool" <fool dlang.org> writes:
On Saturday, 23 August 2014 at 18:07:43 UTC, Fool wrote:
 [2] http://dpaste.dzfl.pl/514a1ef77d0a
The 'three reverses' algorihm could be actually improved. Updated code (along with additional unit tests and a strengthened assertion) is available at [3]. I compared this implementation to an iterator based C++ translation and std::rotate [4]. Considering C++ std::vector and D dynamic array, here are the results: compiler | algorithm | execution time ----------+--------------------------+---------------- clang++ | std::rotate | 1.62s clang++ | my_rotate / std::reverse | 1.45s clang++ | my_rotate / my_reverse | 0.58s ldc2 | rotateTail | 1.64s g++ | std::rotate | 0.57s g++ | my_rotate / std::reverse | 1.43s g++ | my_rotate / my_reverse | 0.36s gdc | rotateTail | 0.38s Here, my_reverse is an adaption of D's reverse for RandomAccessRange. These results make me think that the implementation proposed for RandomAccessRange is not too bad. It needs to be investigated why, in this particular situation, ldc2 produces significantly slower code than gdc and clang. System: GNU/Linux x86_64 Compiler versions: clang version 3.4.2 LDC - the LLVM D compiler (0.14.0): based on DMD v2.065 and LLVM 3.4.2 g++ (GCC) 4.9.1 gdc (GCC) 4.9.1 Compiler flags: clang++ -std=c++11 -O3 -march=native ldc2 -O3 -mcpu=native -release -disable-boundscheck g++ -std=c++11 -O3 -march=native gdc -O3 -march=native -frelease -fno-bounds-check [3] http://dpaste.dzfl.pl/b8e47941a007 [4] C++ translation of rotateTail for RandomAccessRange: template <typename TIterator> void my_reverse ( TIterator b, TIterator e ) { auto steps = std::distance(b, e) / 2; if (steps) { auto l = b; auto r = std::prev(e); do { std::iter_swap(l, r); ++l; --r; } while (--steps); } } template <typename TIterator> TIterator my_rotate ( TIterator b, TIterator m, TIterator e ) { if (m == e) return b; if (b == m) return e; // my_reverse(b, m); std::reverse(b, m); auto s = std::next(b, std::distance(m, e)); // my_reverse(b, e); std::reverse(b, e); // my_reverse(s, e); std::reverse(s, e); return s; }
Aug 24 2014
parent "Fool" <fool dlang.org> writes:
There was a bug in my C++ translation of rotateTail. The only 
significant change in execution time is marked below:

On Sunday, 24 August 2014 at 09:20:59 UTC, Fool wrote:
  compiler | algorithm                | execution time
 ----------+--------------------------+----------------
  clang++  | std::rotate              | 1.62s
  clang++  | my_rotate / std::reverse | 1.44s
  clang++  | my_rotate / my_reverse   | 0.38s <-
  ldc2     | rotateTail               | 1.64s
  g++      | std::rotate              | 0.57s
  g++      | my_rotate / std::reverse | 1.43s
  g++      | my_rotate / my_reverse   | 0.37s
  gdc      | rotateTail               | 0.38s
The fixed implementation of my_rotate:
 template <typename TIterator>
 TIterator my_rotate
 (
    TIterator b,
    TIterator m,
    TIterator e
 )
 {
    if (m == e) return b;
    if (b == m) return e;

 //   my_reverse(m, e); // <-
    std::reverse(m, e); // <-

    auto s = std::next(b, std::distance(m, e));

 //   my_reverse(b, e);
    std::reverse(b, e);
 //   my_reverse(s, e);
    std::reverse(s, e);

    return s;
 }
Aug 24 2014
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Monday, 11 August 2014 at 03:29:56 UTC, Andrei Alexandrescu 
wrote:
[...] can be implemented generically for ranges that offer
 front as a reference:

 bool sameFront(R1, R2)(R1 r1, R2 r2) {
   return &r1.front == &r2.front;
 }
This doesn't work for ranges that visit the same element twice, e.g. cycle(arr).take(arr.length + 1) [0, 0].map!(i => arr[i]) I suspect most ranges will have to implement the sameFront primitive manually, usually forwarding to the underlying range. Related: most mutating algorithms won't work for these kinds of ranges, as we usually presume lvalue ranges never visit the same lvalue twice. Perhaps this needs to be mentioned on affected algorithms?
Aug 18 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/18/14, 1:44 AM, Peter Alexander wrote:
 On Monday, 11 August 2014 at 03:29:56 UTC, Andrei Alexandrescu wrote:
 [...] can be implemented generically for ranges that offer
 front as a reference:

 bool sameFront(R1, R2)(R1 r1, R2 r2) {
   return &r1.front == &r2.front;
 }
This doesn't work for ranges that visit the same element twice, e.g. cycle(arr).take(arr.length + 1) [0, 0].map!(i => arr[i])
Correct. Though maybe that's a good thing - rotating a cycle is unlikely to produce something interesting.
 I suspect most ranges will have to implement the sameFront primitive
 manually, usually forwarding to the underlying range.
Only those that return rvalues from front.
 Related: most mutating algorithms won't work for these kinds of ranges,
 as we usually presume lvalue ranges never visit the same lvalue twice.
 Perhaps this needs to be mentioned on affected algorithms?
Looks like a benign effect to me. Andrei
Aug 18 2014