www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - next_permutation and cartesian product for ranges?

reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
Recently I've been working on some computational geometry code, and
noticed that Phobos doesn't have any equivalent of STL's
next_permutation. I saw there were some discussions about this back in
2010 -- did anything come of it?

If there is any interest in next_permutation, I'd also like to propose a
next_even_permutation for ranges without duplicate elements. There is a
simple way of extending Narayana Pandita's in-place algorithm for
enumerating all permutations such that only even permutations are
returned.

//

Also, I'm interested in Cartesian products of n ranges. Cartesian
products tend to be too long to explicitly construct, so it lends itself
perfectly to a range interface where the elements of the product are
computed on the fly.

Now I saw in the dlang.org search results that Cartesian products have
been discussed before, but got roadblocked because the naïve
implementation of Cartesian product (simply enumerate the first range,
then popFront the second range when the first runs out and gets restored
to the initial .save point, etc.) doesn't lend itself well to infinite
ranges.

Here's my proposed solution: we can use a trick that mathematicians use
in set theory to prove that the cardinality of the rationals is equal to
the cardinality of the natural numbers. Make a table of infinite width
and height, with rows representing one copy of the natural numbers and
columns another copy, then enumerate all elements by enumerating
diagonals. This produces a one-to-one correspondence between the natural
numbers and pairs of natural numbers, of which the rationals are a
subset (thus one has |N| ≤ |Q| ≤ |N|, proving that |N| = |Q|). For
ranges, we don't care about eliminating duplicates like 2/4 = 1/2, since
we want all possible pairs. So the infinite table looks like:

	(0,0) (1,0) (2,0) (3,0) (4,0) ...
	(0,1) (1,1) (2,1) (3,1) ...
	(0,2) (1,2) (2,2) ...
	(0,3) (1,3) ...
	(0,4) ...
	...

The diagonals are therefore:

	(0,0)
	(0,1) (1,0)
	(0,2) (1,1) (2,0)
	(0,3) (1,2) (2,1) (3,0)
	(0,4) (1,3) (2,2) (3,1) (4,0)
	...

The characteristic of the n'th diagonal is that the sum of elements in
each diagonal is equal to n. For example, the zeroth diagonal has only
(0,0) since that's the only pair that sums to 0. The first diagonal is
(0,1), (1,0): each pair sums to 1. The second diagonal: each pair sums
to 2, and so on. The general form for the i'th element of the n'th
diagonal is therefore (i, n-i), for 0 ≤ i ≤ (n-1).

So to enumerate the Cartesian product of two infinite ranges, we
enumerate its diagonals by interpreting the above pairs as indices into
the respective ranges. So, the first range can be a forward range and
the second bidirectional (since for each diagonal we need to enumerate
the second range from the n'th element down to the first). Of course, if
the first range is bidirectional and the second is forward, we can just
swap them and reverse the pairs generated.

If one or both of the ranges is finite, then we can fall back to the
naïve method of enumerating the Cartesian product, in which case we can
relax the requirement on the ranges to be just one forward range and one
input range (with the infinite range being the input range, if one of
them is infinite).

To support the Cartesian product of n ranges, we just compose multiple
binary Cartesian products.

Comments? :)


T

-- 
Many open minds should be closed for repairs. -- K5 user
Oct 09 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/12 1:52 PM, H. S. Teoh wrote:
 Recently I've been working on some computational geometry code, and
 noticed that Phobos doesn't have any equivalent of STL's
 next_permutation. I saw there were some discussions about this back in
 2010 -- did anything come of it?

 If there is any interest in next_permutation, I'd also like to propose a
 next_even_permutation for ranges without duplicate elements. There is a
 simple way of extending Narayana Pandita's in-place algorithm for
 enumerating all permutations such that only even permutations are
 returned.

Yes, we need next_permutation. It will be up to you to convince the Grand Inquisition Committee of Reviewers that next_even_permutation is necessary and/or sufficient.
 //

 Also, I'm interested in Cartesian products of n ranges. Cartesian
 products tend to be too long to explicitly construct, so it lends itself
 perfectly to a range interface where the elements of the product are
 computed on the fly.

 Now I saw in the dlang.org search results that Cartesian products have
 been discussed before, but got roadblocked because the naïve
 implementation of Cartesian product (simply enumerate the first range,
 then popFront the second range when the first runs out and gets restored
 to the initial .save point, etc.) doesn't lend itself well to infinite
 ranges.

 Here's my proposed solution: we can use a trick that mathematicians use
 in set theory to prove that the cardinality of the rationals is equal to
 the cardinality of the natural numbers.

Yah, I call that the Cantor's diagonalization trick - I reckon that's how he proved there are more reals than rationals: http://goo.gl/MTmnQ. We need that badly. Please implement pronto and let us destroy you for having done so. Thanks, Andrei
Oct 09 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 10/09/2012 10:46 PM, H. S. Teoh wrote:
 On Tue, Oct 09, 2012 at 02:25:13PM -0400, Andrei Alexandrescu wrote:
 On 10/9/12 1:52 PM, H. S. Teoh wrote:

 If there is any interest in next_permutation, I'd also like to
 propose a next_even_permutation for ranges without duplicate
 elements. There is a simple way of extending Narayana Pandita's
 in-place algorithm for enumerating all permutations such that only
 even permutations are returned.

Yes, we need next_permutation. It will be up to you to convince the Grand Inquisition Committee of Reviewers that next_even_permutation is necessary and/or sufficient.

Is there much chance of this, if the algorithm only works correctly when the input array has no duplicate elements? AFAIK there's no easy way to ensure unique elements without introducing runtime checks. [...]
 Now I saw in the dlang.org search results that Cartesian products
 have been discussed before, but got roadblocked because the nave
 implementation of Cartesian product (simply enumerate the first
 range, then popFront the second range when the first runs out and
 gets restored to the initial .save point, etc.) doesn't lend itself
 well to infinite ranges.

 Here's my proposed solution: we can use a trick that mathematicians
 use in set theory to prove that the cardinality of the rationals is
 equal to the cardinality of the natural numbers.

Yah, I call that the Cantor's diagonalization trick - I reckon that's how he proved there are more reals than rationals: http://goo.gl/MTmnQ. We need that badly. Please implement pronto and let us destroy you for having done so.

Here it is: [snip code snippet] Destroy! T

That's cute. =) flatten is in std.algorithm and is called joiner. Also, you need to be careful with index types. I'd suggest: import std.algorithm, std.range; auto cartesianProd(R1,R2)(R1 range1, R2 range2) if (isForwardRange!R1 && isInfinite!R1 && isBidirectionalRange!(typeof(take(range2,1))) && isInfinite!R2) { return sequence!"n"(cast(size_t)1).map!( (size_t n)=>range1.save.take(n) // need to specify param type. DMD bug. .zip(range2.save.take(n).retro), ).joiner; }
Oct 09 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/10/2012 01:13 AM, H. S. Teoh wrote:
 On Tue, Oct 09, 2012 at 11:24:27PM +0200, Timon Gehr wrote:
 [...]
 That's cute. =)
 flatten is in std.algorithm and is called joiner.

Ahahahaha... I knew joiner existed; why didn't I think of it. :P
 Also, you need to be careful with index types.
 I'd suggest:

 import std.algorithm, std.range;

 auto cartesianProd(R1,R2)(R1 range1, R2 range2)
 if (isForwardRange!R1 && isInfinite!R1 &&
      isBidirectionalRange!(typeof(take(range2,1))) && isInfinite!R2)
 {
 	return sequence!"n"(cast(size_t)1).map!(
 		(size_t n)=>range1.save.take(n) // need to specify param type. DMD bug.
 		.zip(range2.save.take(n).retro),
 	).joiner;
 }

Ack! UFCS overload! ;-) (just kidding... it's actually a lot easier to read with UFCS.)

I usually use standard function call notation for zip. (but the mail client wraps around the code, so I inserted a manual line break.)
Oct 09 2012
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 10/09/2012 11:23 PM, bearophile wrote:
 Andrei Alexandrescu:

 Yes, we need next_permutation. It will be up to you to convince the
 Grand Inquisition Committee of Reviewers that next_even_permutation is
 necessary and/or sufficient.

I don't like the design of C++ STL here. Instead of a next_permutation(), what about ranges that yield all permutations, all adjacent permutations, all combinations? [snip code]

I agree that that's better conceptually, but your implementations have overflow issues. (And it is not too obvious how to fix that, given the existing range design.)
Oct 09 2012
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/12 4:46 PM, H. S. Teoh wrote:
 On Tue, Oct 09, 2012 at 02:25:13PM -0400, Andrei Alexandrescu wrote:
 On 10/9/12 1:52 PM, H. S. Teoh wrote:

 If there is any interest in next_permutation, I'd also like to
 propose a next_even_permutation for ranges without duplicate
 elements. There is a simple way of extending Narayana Pandita's
 in-place algorithm for enumerating all permutations such that only
 even permutations are returned.

Yes, we need next_permutation. It will be up to you to convince the Grand Inquisition Committee of Reviewers that next_even_permutation is necessary and/or sufficient.

Is there much chance of this, if the algorithm only works correctly when the input array has no duplicate elements? AFAIK there's no easy way to ensure unique elements without introducing runtime checks.

We draw the line at memory safety. If you must assume distinct elements and state that assumption, you're fine as long as you don't transgress into unsafe. One alternative would be to run a O(N) check with probability 1/N, trick that assumeSorted() does.
 [...]
 Now I saw in the dlang.org search results that Cartesian products
 have been discussed before, but got roadblocked because the nave
 implementation of Cartesian product (simply enumerate the first
 range, then popFront the second range when the first runs out and
 gets restored to the initial .save point, etc.) doesn't lend itself
 well to infinite ranges.

 Here's my proposed solution: we can use a trick that mathematicians
 use in set theory to prove that the cardinality of the rationals is
 equal to the cardinality of the natural numbers.

Yah, I call that the Cantor's diagonalization trick - I reckon that's how he proved there are more reals than rationals: http://goo.gl/MTmnQ. We need that badly. Please implement pronto and let us destroy you for having done so.

Here it is: ------------------------------------------------------------------------------- import std.algorithm; import std.range; // I wrote this 'cos I needed it. Why isn't it in std.range? // A better name for this is probably "denest" or "denestOne", since it doesn't // really flatten arbitrarily-nested ranges. auto flatten(R)(R ror)

crossProduct.
 	if (isInputRange!R&&  isInputRange!(ElementType!R))

Tab detected. System overload
Oct 09 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/9/12 7:25 PM, H. S. Teoh wrote:
 I was an idiot. I knew about std.algorithm.joiner but for some reason
 didn't think of it at the time. In any case, crossProduct doesn't make
 any sense to me... I don't see what it's got to do with flattening
 nested ranges.

I meant the resulting range spans the cross product (all combinations) of the passed-in ranges. On another subject, I think this can be done with only input ranges - no need for bidirectional. Andrei
Oct 10 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/12 10:27 AM, H. S. Teoh wrote:
 On Wed, Oct 10, 2012 at 09:41:34AM -0400, Andrei Alexandrescu wrote:
 On 10/9/12 7:25 PM, H. S. Teoh wrote:
 I was an idiot. I knew about std.algorithm.joiner but for some reason
 didn't think of it at the time. In any case, crossProduct doesn't
 make any sense to me... I don't see what it's got to do with
 flattening nested ranges.

I meant the resulting range spans the cross product (all combinations) of the passed-in ranges.

Hmm. We must be using a different set of terms, because in my understanding, a cross product is a binary operation on 3D vectors that produces a perpendicular vector. The product that produces all combinations of a set is known as a Cartesian product to me.

Same thing for sets. I agree using Cartesian is more precise. http://www.transtutors.com/math-homework-help/set-theory/cartesian-product.aspx
 On another subject, I think this can be done with only input ranges -
 no need for bidirectional.

In the case of finite ranges, one of the ranges can be an input range, but the other must at least be a forward range, since otherwise you could only traverse it once, so it would be impossible to combine each element with every element from the other range.

Yah.
 When one of the ranges is infinite, it's obviously not enough for the
 other range to be an input range, since you'd have to produce an
 infinite number of combinations per element (of the second range) before
 moving on to the next element, so you'd never get to it.

Something like that.
 It should be possible to do it with both ranges being forward ranges,
 but it may not be very efficient -- think about it, for every element in
 one range, you need to combine it with every element in the other range,
 so when you're at, say, the 100th element of the first range, you still
 need to go back to the 1st, 2nd, 3rd, etc., elements in the second
 range. But since you can only combine it with a finite number of
 elements from the second range (otherwise you'll never get to the 101th
 element), when you get to the 101th element, you still haven't produced
 some of the combinations of the 100th element yet. So you'll end up
 needing an increasing number of save points in order to get to every
 possible combination -- possible but not efficient.

 Using the Cantor method we can allow one range to be a forward range,
 and the second a range whose take(n) is a reversible range (I was wrong
 before: the second range doesn't need to be bidirectional, it works as
 long as take(n) returns a reversible range). This gives us constant
 space complexity. I'd say this is a pretty good solution.

I'm still thinking of Cantor's method, just a different schedule of spanning the triangle. Multiple save points are necessary. Consider you span the triangle in squares of increasing side. For each side size you first walk along one dimension, then walk along the other. Andrei
Oct 10 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/10/12 10:59 AM, Andrei Alexandrescu wrote:
 I'm still thinking of Cantor's method, just a different schedule of
 spanning the triangle. Multiple save points are necessary.

I meant "Multiple save points are NOT necessary." Andrei
Oct 10 2012
prev sibling parent Don Clugston <dac nospam.com> writes:
On 10/10/12 00:22, bearophile wrote:
 Steven Schveighoffer:

 Is there any advantage over having a function?  I'd think you could
 easily build a range based on the function, no?

Generators (that yield lexicographic permutations, permutation swaps, combinations, etc) are quite more handy, you can compose them with the other ranges and higher order functions.

 Adding to Phobos few ranges that build on a next_permutation(),
 next_combination(), etc, is possible, but I don't see a big need for
 such C++-style functions, the ranges are enough.

What makes you think that permutations are only ever used in a range context? Do you have any evidence for that? Sounds like an abstraction inversion to me.
Oct 10 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, October 09, 2012 14:25:13 Andrei Alexandrescu wrote:
 We need that badly. Please implement pronto and let us destroy you for
 having done so.

Sounds violent. :) - Jonathan M Davis
Oct 09 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 Yes, we need next_permutation. It will be up to you to convince 
 the Grand Inquisition Committee of Reviewers that 
 next_even_permutation is necessary and/or sufficient.

I don't like the design of C++ STL here. Instead of a next_permutation(), what about ranges that yield all permutations, all adjacent permutations, all combinations? Permutations: import std.algorithm, std.conv, std.traits; struct Permutations(bool doCopy=true, T) if (isMutable!T) { private immutable size_t num; private T[] items; private uint[31] indexes; private ulong tot; this (/*in*/ T[] items) /*pure*/ nothrow in { static enum string L = text(indexes.length); // impure assert(items.length >= 0 && items.length <= indexes.length, "Permutations: items.length must be >= 0 && < " ~ L); } body { static ulong factorial(in uint n) pure nothrow { ulong result = 1; foreach (i; 2 .. n + 1) result *= i; return result; } this.num = items.length; this.items = items.dup; foreach (i; 0 .. cast(typeof(indexes[0]))this.num) this.indexes[i] = i; this.tot = factorial(this.num); } property T[] front() pure nothrow { static if (doCopy) //return items.dup; // not nothrow return items ~ []; // slower else return items; } property bool empty() const pure nothrow { return tot == 0; } void popFront() pure nothrow { tot--; if (tot > 0) { size_t j = num - 2; while (indexes[j] > indexes[j + 1]) j--; size_t k = num - 1; while (indexes[j] > indexes[k]) k--; swap(indexes[k], indexes[j]); swap(items[k], items[j]); size_t r = num - 1; size_t s = j + 1; while (r > s) { swap(indexes[s], indexes[r]); swap(items[s], items[r]); r--; s++; } } } } Permutations!(doCopy,T) permutations(bool doCopy=true, T) (/*in*/ T[] items) /*pure*/ nothrow if (isMutable!T) { return Permutations!(doCopy, T)(items); } unittest { import std.bigint; foreach (p; permutations([BigInt(1), BigInt(2), BigInt(3)])) assert((p[0] + 1) > 0); } version (permutations2_main) { void main() { import std.stdio, std.bigint; foreach (p; permutations!false([1, 2, 3])) writeln(p); alias BigInt B; foreach (p; permutations!false([B(1), B(2), B(3)])) {} } } ---------------------------------- Adjacent permutations by swapping (to be converted in a range): import std.algorithm, std.array, std.typecons, std.range; struct Spermutations { const int n; alias Tuple!(int[], int) TResult; int opApply(int delegate(ref TResult) dg) { int result; TResult aux; int sign = 1; alias Tuple!(int, int) Pair; auto p = iota(n).map!(i => Pair(i, i ? -1 : 0))().array(); aux = tuple(p.map!(pp => pp[0])().array(), sign); result = dg(aux); if (result) goto END; while (p.canFind!(pp => pp[1])()) { // Failed using std.algorithm here, too much complex auto largest = Pair(-100, -100); int i1 = -1; foreach (i, pp; p) if (pp[1]) { if (pp[0] > largest[0]) { i1 = i; largest = pp; } } int n1 = largest[0], d1 = largest[1]; sign *= -1; int i2; if (d1 == -1) { i2 = i1 - 1; swap(p[i1], p[i2]); if (i2 == 0 || p[i2 - 1][0] > n1) p[i2][1] = 0; } else if (d1 == 1) { i2 = i1 + 1; swap(p[i1], p[i2]); if (i2 == n - 1 || p[i2 + 1][0] > n1) p[i2][1] = 0; } aux = tuple(p.map!(pp => pp[0])().array(), sign); result = dg(aux); if (result) goto END; foreach (i3, ref pp; p) { auto n3 = pp[0], d3 = pp[1]; if (n3 > n1) pp[1] = (i3 < i2) ? 1 : -1; } } END: return result; } } void main() { import std.stdio; foreach (n; [3, 4]) { writefln("\nPermutations and sign of %d items", n); foreach (tp; Spermutations(n)) writefln("Perm: %s Sign: %2d", tp.tupleof); } } ---------------------------------- Combinations (to be converted in a range): ulong binomial(long n, long k) pure nothrow in { assert(n > 0, "binomial: n must be > 0."); } body { if (k < 0 || k > n) return 0; if (k > (n / 2)) k = n - k; ulong result = 1; foreach (ulong d; 1 .. k + 1) { result *= n; n--; result /= d; } return result; } struct Combinations(T, bool copy=true) { // Algorithm by Knuth, Pre-fascicle 3A, draft of // section 7.2.1.3: "Generating all partitions". T[] items; int k; size_t len = -1; // computed lazily this(in T[] items, in int k) in { assert(items.length, "combinations: items can't be empty."); } body { this.items = items.dup; this.k = k; } property size_t length() /*logic_const*/ { if (len == -1) // set cache len = cast(size_t)binomial(items.length, k); return len; } int opApply(int delegate(ref T[]) dg) { if (k == items.length) return dg(items); // yield items auto outarr = new T[k]; if (k == 0) return dg(outarr); // yield [] if (k < 0 || k > items.length) return 0; // yield nothing int result, x; immutable n = items.length; auto c = new uint[k + 3]; // c[0] isn'k used foreach (j; 1 .. k + 1) c[j] = j - 1; c[k + 1] = n; c[k + 2] = 0; int j = k; while (true) { // The following lines equal to: //int pos; //foreach (i; 1 .. k +1) // outarr[pos++] = items[c[i]]; auto outarr_ptr = outarr.ptr; auto c_ptr = &(c[1]); auto c_ptrkp1 = &(c[k + 1]); while (c_ptr != c_ptrkp1) *outarr_ptr++ = items[*c_ptr++]; static if (copy) { auto outarr2 = outarr.dup; result = dg(outarr2); // yield outarr2 } else { result = dg(outarr); // yield outarr } if (j > 0) { x = j; c[j] = x; j--; continue; } if ((c[1] + 1) < c[2]) { c[1]++; continue; } else j = 2; while (true) { c[j - 1] = j - 2; x = c[j] + 1; if (x == c[j + 1]) j++; else break; } if (j > k) return result; // End c[j] = x; j--; } } } Combinations!(T,copy) combinations(bool copy=true, T) (in T[] items, in int k) in { assert(items.length, "combinations: items can't be empty."); } body { return Combinations!(T, copy)(items, k); } ---------------------------------- Bye, bearophile
Oct 09 2012
prev sibling next sibling parent "ixid" <nuaccount gmail.com> writes:
 struct Spermutations {

The naming scheme could do with... improvement.
Oct 09 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
ixid:

 The naming scheme could do with... improvement.

struct PermutationSwaps then? Bye, bearophile
Oct 09 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 09 Oct 2012 17:23:07 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Andrei Alexandrescu:

 Yes, we need next_permutation. It will be up to you to convince the  
 Grand Inquisition Committee of Reviewers that next_even_permutation is  
 necessary and/or sufficient.

I don't like the design of C++ STL here. Instead of a next_permutation(), what about ranges that yield all permutations, all adjacent permutations, all combinations?

Is there any advantage over having a function? I'd think you could easily build a range based on the function, no? -Steve
Oct 09 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Is there any advantage over having a function?  I'd think you 
 could easily build a range based on the function, no?

Generators (that yield lexicographic permutations, permutation swaps, combinations, etc) are quite more handy, you can compose them with the other ranges and higher order functions. Adding to Phobos few ranges that build on a next_permutation(), next_combination(), etc, is possible, but I don't see a big need for such C++-style functions, the ranges are enough. Bye, bearophile
Oct 09 2012
prev sibling parent "ixid" <nuaccount gmail.com> writes:
On Tuesday, 9 October 2012 at 22:06:42 UTC, bearophile wrote:
 ixid:

 The naming scheme could do with... improvement.

struct PermutationSwaps then? Bye, bearophile

I think you're missing what I meant, naming a struct "Sperm"-utations seemed a bit unpleasant, I'm not actually bothered about your naming scheme. =)
Oct 09 2012