www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 7128] New: Cartesian product of ranges

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128

           Summary: Cartesian product of ranges
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



It's useful to have a cartesian product of ranges, like a std.range.product:

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=31050

http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=31054


See also the handy design of Python itertools.product:

http://docs.python.org/library/itertools.html#itertools.product

itertools.product(*iterables[, repeat]) 
Cartesian product of input iterables.

product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 18 2011
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




https://github.com/D-Programming-Language/phobos/pull/856

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jan 03 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128





 https://github.com/D-Programming-Language/phobos/pull/856
I guess the Python "repeat" optional argument is not supported:
 from itertools import product
 list(product("abc", repeat=4))
[('a', 'a', 'a', 'a'), ('a', 'a', 'a', 'b'), ('a', 'a', 'a', 'c'), ('a', 'a', 'b', 'a'), ('a', 'a', 'b', 'b'), ...] So you have to write: array(cartesianProduct("abc", "abc", "abc", "abc")) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jan 03 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




Partially implemented:

http://forum.dlang.org/thread/510f29fd574fd_2193e77ae82545b sh2.rs.github.com.mail


Currently three or more arguments are not accepted:

import std.stdio, std.algorithm;
void main() {
    cartesianProduct([0,1], [0,1], [0,1]).writeln();
}


Another interesting feature of Python itertools.product() that's missing is the
optional "repeat" argument:

 from itertools import product
 list(product([0,1], repeat=3))
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
 list(product([0,1], repeat=4))
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1), (0, 1, 0, 0), (0, 1, 0, 1), (0, 1, 1, 0), (0, 1, 1, 1 ), (1, 0, 0, 0), (1, 0, 0, 1), (1, 0, 1, 0), (1, 0, 1, 1), (1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0), (1, 1, 1, 1)] -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 05 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




Now that the basic 2-argument case is working, I'll start looking into the
multi-argument case.

The simplest implementation is to just alias the variadic version to the
2-argument version chained to a recursive invocation of itself, but this has
the disadvantage that the resulting range will have nested tuples rather than a
single tuple of multiple values at the top-level.

For the repeated case, I'm thinking that with D's compile-time introspection,
we can probably implement:

auto cartesianPower(int n, R)(R range) { ... }

which returns the cartesian product of n copies of the range with. Or maybe:

auto cartesianProduct(R)(R range, int repeat) { ... }

Don't really know which API is better.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 05 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




P.S. on second thoughts, probably the second version is not implementable right
now because we can't compute the type of the return value at runtime.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 05 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128






Thank you for your work. I use cartesian often enough in Python.

 but this has
 the disadvantage that the resulting range will have nested tuples rather than a
 single tuple of multiple values at the top-level.
I think this solution is not good enough. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 05 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128





 P.S. on second thoughts, probably the second version is not implementable right
 now because we can't compute the type of the return value at runtime.
To support this API this cartesianProduct has to return a range of dynamic arrays(this is possible because the types of the items inside the arrays is uniform), but the others return a range of tuples... auto cartesianProduct(R)(R range, int repeat) { ... } -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 05 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




See also Issue 6788

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




*** Issue 6788 has been marked as a duplicate of this issue. ***

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128





 *** Issue 6788 has been marked as a duplicate of this issue. ***
It's not a dupe. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




https://github.com/D-Programming-Language/phobos/pull/1120

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128


Andrei Alexandrescu <andrei erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |andrei erdani.com
         Resolution|                            |FIXED



15:01:41 PST ---
https://github.com/D-Programming-Language/phobos/pull/1120

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




The repetition number is not yet supported:

auto cartesianProduct(size_t nRepetitions=0, R)(R range) { ... }

If the name if the function is different, then it's even possible to support
this API, where it returns a range of dynamic arrays:

auto cartesianPower(R)(R range, in size_t n) { ... }

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128


hsteoh quickfur.ath.cx changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |



The power must be a compile-time parameter, otherwise it won't compile. The
variadic cartesianProduct only works if the number of arguments is known at
compile-time.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 07 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128





 The power must be a compile-time parameter, otherwise it won't compile.
I think it's possible to create a range like this: auto cartesianPower(R)(R range, in size_t n) { ... } If you call it with: int n = 3; auto result = cartesianPower([0, 1], n).array(); result should be: [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]] One use case: generate dstrings (dchar[][]) for a "Bulls and Cows" game, where they need to be composed of N distinct nonzero digits: dchar[][] genCases(in size_t nDigits) { return cartesianPower("123456789"d.dup, nDigits) .filter!(a => a.sort().uniq().walkLength() == nDigits)() .array(); } So genCases(4).writeln() should print: ["4321", "5321", "6321", "7321", "8321", "9321", "3421", "5421", "6421", "7421", "8421", "9421", "3521", "4521", "6521", "7521", "8521", "9521", "3621", "4621", "5621", "7621", "8621", "9621", "3721", "4721", "5721", "6721", "8721", "9721", "3821", "4821", "5821", "6821", "7821", "9821", "3921", "4921", "5921", ...] -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 08 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




Hmm, you're right. The cartesian power has the advantage that the output will
be tuples of homogenous elements, so we can use arrays instead of tuples, which
will allow runtime variation.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 08 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128





 Hmm, you're right. The cartesian power has the advantage that the output will
 be tuples of homogenous elements, so we can use arrays instead of tuples, which
 will allow runtime variation.
Right. On the other hand if we do this, cartesianProduct() and cartesianPower() will have different type (this means: they will be ranges that yield different class of types, the first tuples, the second dynamic arrays). This requires people that want to use such functions to remember this difference (a broken symmetry). So this design decision has a little associated cost. But in the end I prefer cartesianPower() to yield dynamic arrays because dynamic arrays are more flexible, they offer both the size_t n to be a run-time value, and dynamic arrays are more usable than tuples in D (I think tuples are not ranges). Python itertools.cartesian doesn't have such problems because Python is dynamically typed, so the items that itertools.cartesian are usable like normal sequences (unlike D tuples). If you use a cartesianProduct() to produce the data for the "Bulls and cows" game you write something like: auto d9 = "123456789"d.dup; auto choices = cartesianProduct(d9, d9, d9, d9) .map!(t => [t.tupleof])() .filter!(a => a.sort().uniq().walkLength() == 4)() .array(); The map!(t => [t.tupleof])() is used to convert a tuple into a dynamic array, so later on it you can perform more processing, with a filter() and an array(). D tuples don't play very well with ranges. - - - - - - - - - - - Maybe std.typecons.Tuple should define a handy std.typecons.Tuple.arrayof attribute (that returns [this.tupleof]) when all the items of the tuple have the same type. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 09 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=7128




Extra on cartesianProduct: maybe you should add a little benchmark, to be sure
the way you have implemented the variadic cartesianProduct() (with the
flattening of nested tuples) is efficient/fast enough:

enum N = 10; // Change me.
int[] items = iota(N).array();
immutable len = cartesianProduct(items, items, items, items,
items).walkLength();

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 09 2013