www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 6788] New: std.range.pairwise?

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6788

           Summary: std.range.pairwise?
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



This D2 program contains a very common iteration patten:


import std.stdio, std.typecons;
void main() {
  auto data = [1, 2, 3, 4];
  foreach (i, x1; data)
    foreach (x2; data[i+1 .. $])
      writeln(tuple(x1, x2)); // do something with x1 and x2
}


It prints (dmd 2.056head):

Tuple!(int,int)(1, 2)
Tuple!(int,int)(1, 3)
Tuple!(int,int)(1, 4)
Tuple!(int,int)(2, 3)
Tuple!(int,int)(2, 4)
Tuple!(int,int)(3, 4)


As you see it is quite different from "zip".

So I suggest to add a std.range.pairwise range that does the same thing (same
output):


import std.stdio, std.range;
void main() {
    auto data = [1, 2, 3, 4];
    foreach (tup; pairwise(data))
        writeln(tup);
}


This coding pattern is very common. It happens when you have a sequence of
items, and you want to work (apply a function) on all the pairs, and your
diadic function (function with two arguments) doesn't care for its arguments
order (symmetric function).

Often O(n ** 2) algorithms have to see all the pairs. But often the diadic
function is symmetric. So you need to scan only half of the matrix, an upper
triangle, or lower one. This is where pairwise is useful for. It's not a
necessary range (even zip is not necessary in D), because two loops are enough
to replace it, but it helps a bit because it avoids mistakes with the array
indexes. I have done similar mistakes more than once in past.

pairwise(sequence) is mostly meant to be used with a random-access input
sequence. It's not hard to make it work with forward ranges too, but it becomes
less efficient.


In Python I use a similar generator, defined just like this:


from itertools import combinations
def pairwise(seq):
    return combinations(seq, 2)


Usage examples:

 list( pairwise([]) )
[]
 list(pairwise([1,2]))
[(1, 2)]
 list( pairwise("abc") )
[('a', 'b'), ('a', 'c'), ('b', 'c')]
 list(pairwise([0, 1, 2, 3]))
[(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
 list(pairwise(range(5)))
[(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)] I expect Phobos to eventually have two very useful lazy ranges that generate permutations and combinations. But pairwise is supposed to be very efficient, here I'd like the compiler to generate code similar to two loops inside each other, like in the first code example. This is probably hard to do if pairwise is implemented with a call to a combinations() function. Why is pairwise just about pairs? Isn't a more general solution better? [20:33] bearophile A general function that takes n ranges and yields all possible unsorted n-tuples is possible, but in practice this is a very uncommon thing in my code. This is why I have limited it to pairs. It is just about pairs because this is the most common case. A potential problem: some users might wrongly think pairwise(sequence) is the same as std.range.chunks(sequence,2). The documentation has to specify they are two quite different things, to avoid some of such mistakes. This is an alternative design, similar to how lockstep works: import std.stdio, std.range; void main() { auto data = [1, 2, 3, 4]; foreach (x1, x2; pairwise(data)) writeln(x1, " ", x2); } -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Oct 07 2011
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6788




See also Issue 7128

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6788


hsteoh quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |hsteoh quickfur.ath.cx
         Resolution|                            |DUPLICATE



This is the same as the cartesianProduct of two ranges, which is already
checked into git HEAD.

*** This issue has been marked as a duplicate of issue 7128 ***

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6788


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|DUPLICATE                   |




 This is the same as the cartesianProduct of two ranges, which is already
 checked into git HEAD.
 
 *** This issue has been marked as a duplicate of issue 7128 ***
See this test code: import std.stdio, std.algorithm; void main() { auto data = [1, 2, 3, 4]; foreach (xy; cartesianProduct(data, data)) writeln(xy); } Tuple!(int, int)(1, 1) Tuple!(int, int)(2, 1) Tuple!(int, int)(3, 1) Tuple!(int, int)(4, 1) Tuple!(int, int)(1, 2) Tuple!(int, int)(2, 2) Tuple!(int, int)(3, 2) Tuple!(int, int)(4, 2) Tuple!(int, int)(1, 3) Tuple!(int, int)(2, 3) Tuple!(int, int)(3, 3) Tuple!(int, int)(4, 3) Tuple!(int, int)(1, 4) Tuple!(int, int)(2, 4) Tuple!(int, int)(3, 4) Tuple!(int, int)(4, 4) import std.stdio, std.range; void main() { auto data = [1, 2, 3, 4]; foreach (tup; pairwise(data)) writeln(tup); } Should print: Tuple!(int,int)(1, 2) Tuple!(int,int)(1, 3) Tuple!(int,int)(1, 4) Tuple!(int,int)(2, 3) Tuple!(int,int)(2, 4) Tuple!(int,int)(3, 4) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 07 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6788




Basic version for a random access range:


import std.range: ForeachType, isRandomAccessRange, hasLength;
import std.traits: Unqual, isNarrowString;
import std.typecons: Tuple;

struct Pairwise(Range) {
    alias R = Unqual!Range;
    alias Pair = Tuple!(ForeachType!R, "a", ForeachType!R, "b");
    R _input;
    size_t i, j;

    this(Range r_) {
        this._input = r_;
        j = 1;
    }

     property bool empty() {
        return j >= _input.length;
    }

     property Pair front() {
        return Pair(_input[i], _input[j]);
    }

    void popFront() {
        if (j >= _input.length - 1) {
            i++;
            j = i + 1;
        } else {
            j++;
        }
    }
}

Pairwise!Range pairwise(Range)(Range r)
if (isRandomAccessRange!Range && hasLength!Range && !isNarrowString!Range) {
    return typeof(return)(r);
}

import std.stdio: writeln;

void main() {
    (new int[0]).pairwise.writeln;
    [10].pairwise.writeln;
    [10, 20].pairwise.writeln;
    [10, 20, 30].pairwise.writeln;
    [10, 20, 30, 40].pairwise.writeln;
}



Once combinations() is implemented that code can be replaced with something
like:

auto pairwise(Range)(Range r) if (...) {
    alias R = Unqual!Range;
    alias Pair = Tuple!(ForeachType!R, "a", ForeachType!R, "b");    
    return combinations(r, 2).map(t => Pair(t.tupleof));
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 12 2013