www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 5550] New: std.range.enumerate()

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550

           Summary: std.range.enumerate()
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



I suggest to add an enumerate() to Phobos std.range module. An alternative name
is count().

enumerate() is a simple function, it takes one iterable and returns an iterable
of (index, item) pairs:

 list(enumerate("abcd"))
[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')] A second optional argument allows to define a start value different from zero:
 list(enumerate("abcd", 3))
[(3, 'a'), (4, 'b'), (5, 'c'), (6, 'd')] This is an example usage of Python2.6 enumerate():
 comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
 [i for i,b in enumerate(comp) if not b]
[2, 3, 5, 7, 11, 13, 17, 19]
 comp = comp[2:]
 comp
[0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]
 [i for i,b in enumerate(comp, 2) if not b]
[2, 3, 5, 7, 11, 13, 17, 19] Here the input 'comp' is a list of bits (booleans) produced by a Sieve of Eratosthenes, that represents the composite numbers. The desired output (a sequence of primes) is a list of indexes where the value b is false, so it's prime. This is a D2 translation in procedural style: import std.stdio; void main() { auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]; int[] result; foreach (i, p; comp) if (!p) result ~= i; writeln(result); comp = comp[2..$]; result.length = 0; foreach (i, p; comp) if (!p) result ~= i + 2; writeln(result); } A translation in functional style: import std.stdio, std.algorithm, std.range; void main() { auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]; auto result1 = map!q{a[0]}(filter!q{!a[1]}(zip(iota(comp.length), comp))); writeln(result1); comp = comp[2..$]; auto result2 = map!q{a[0]+2}(filter!q{!a[1]}(zip(iota(comp.length), comp))); writeln(result2); } An enumerate() allows to write simpler code: import std.stdio, std.algorithm, std.range; void main() { auto comp = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1]; auto result1 = map!q{a[0]}(filter!q{!a[1]}(enumerate(comp))); writeln(result1); comp = comp[2..$]; auto result2 = map!q{a[0]}(filter!q{!a[1]}(enumerate(comp, 2))); writeln(result2); } zip(iota(foo.length), foo) becomes even worse when the iterable foo is lazy and doesn't define a length: zip(iota(walkLength(foo)), foo) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 08 2011
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




This is a basic implementation of enumerate() (it's only an InputRange, but
probably a richer range too should be supported).

The demo code in main() shows the task of processing the Sieve of Eratosthenes
flags without and with an enumerate(). The version with enumerate() is shorter
and simpler to understand than the version with zip.


import std.stdio, std.algorithm, std.range, std.typecons, std.traits,
std.array;

struct Enumerate(R) {
    R r;
    int i;
    alias r this;

     property Tuple!(typeof(this.i), typeof(R.init.front)) front() {
        return typeof(return)(i, this.r.front);
    }

    void popFront() {
        this.r.popFront();
        this.i++;
    }
}

Enumerate!R enumerate(R)(R range, int start=0) if (isInputRange!R) {
    return Enumerate!R(range, start);
}

void main() {
    // not prime flags, from a Sieve of Eratosthenes.
    // 0 = prime number, 1 = not prime number. starts from index 2.
    auto flags = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];

    // not using enumerate
    iota(2, int.max).zip(flags).filter!q{!a[1]}().map!q{a[0]}().writeln();
    iota(int.max).zip(flags).filter!q{!a[1]}().map!q{a[0] + 2}().writeln();

    // using enumerate
    //flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln(); // error
    filter!q{!a[1]}(flags.enumerate(2)).map!q{a[0]}().writeln();
}



Note: this produces a compilation error, I don't know why yet, maybe it's a bad
interaction with alias this:

flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln();

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




The problem was caused by the alias this. This avoids the problem:


import std.stdio, std.algorithm, std.range, std.typecons, std.traits,
std.array;

struct Enumerate(R) {
    R r;
    int i;

     property bool empty() {
        return this.r.empty;
    }

     property Tuple!(typeof(this.i), typeof(R.init.front)) front() {
        return typeof(return)(i, this.r.front);
    }

    void popFront() {
        this.r.popFront();
        this.i++;
    }
}

Enumerate!R enumerate(R)(R range, int start=0) if (isInputRange!R) {
    return Enumerate!R(range, start);
}

void main() {
    auto flags = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1];

    flags.enumerate(2).filter!q{!a[1]}().map!q{a[0]}().writeln();
}


The last line is noisy but it's readable.

This is less noisy but longer:

flags.enumerate(2).filter!(t => !t[1])().map!(t => t[0])().writeln();

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 26 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




See also Issue 8155

----------

enumerate and AA.byPair() are also handy when you need to count associative
array key-value pairs:

foreach (i, k, v; associativeArray.byPair.enumerate()) {...}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 25 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




A well implemented enumerate() seems useful to avoid one of my recurrent D
bugs. This is a simplified example to show the bug. I have to create a matrix
like this:

0 10 20 30
0 11 21 31
0 12 22 32


There are many ways to inizialize a matrix like that, this is one way, but it's
not complete:


import std.stdio;
void main() {
    auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
    foreach (r, row; M)
        foreach (c, ref item; row)
            item = c * 10 + r;
    writefln("%(%s\n%)", M);
}


It outputs:

[0, 10, 20, 30]
[1, 11, 21, 31]
[2, 12, 22, 32]


To complete the algorithm, that is to not touch the first column, I can use
just a slice in the second foreach, to not touch the first column:


import std.stdio;
void main() {
    auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
    foreach (r, row; M)
        foreach (c, ref item; row[1 .. $])
            item = c * 10 + r;
    writefln("%(%s\n%)", M);
}


But this introduces the bug:

[0, 0, 10, 20]
[0, 1, 11, 21]
[0, 2, 12, 22]


Slicing 'row' from the second item I avoid to write on the first cells of each
row, but the 'c' index doesn't get sliced, it starts from zero still. One
correct version needs to increment c by one in the formula:

import std.stdio;
void main() {
    auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
    foreach (r, row; M)
        foreach (c, ref item; row[1 .. $])
            item = (c + 1) * 10 + r;
    writefln("%(%s\n%)", M);
}


Another solution is to ignore the c==0:

    foreach (r, row; M)
        foreach (c, ref item; row)
            if (c != 0)
                item = c * 10 + r;


A well implemented enumerate() is one way to avoid that problem (other
solutions are possible), now 'c' gets sliced, so it doesn't start from zero:


import std.stdio;
void main() {
    auto M = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]];
    foreach (r, row; M)
        foreach (c, ref item; enumerate(row)[1 .. $])
            item = c * 10 + r;
    writefln("%(%s\n%)", M);
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jul 13 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




A comment by Christophe Travert:

 enumerate could be useful with retro too. You may want to change the 
 order of the enumeration, but not the order of the indices.
-- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 16 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550


bioinfornatics <bioinfornatics gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bioinfornatics gmail.com



05:05:05 PST ---
I agree this is really an important feature when using Phobos range.

My need it is to get an enumerate isntance who lzazy evaluate (index, nth
elements) to avoid copy/construct the new range which cost cpu time with a big
range

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Nov 20 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550




An alternative idea is to introduce the "imap" and "ifilter" functions, similar



import std.stdio: writeln;
import std.algorithm: imap;
void main() {
    "abcd".imap!(t => t).writeln;
}


Should print:

[Tuple!(uint, dchar)(0, 'a'), Tuple!(uint, dchar)(1, 'b'), Tuple!(uint,
dchar)(2, 'c'), Tuple!(uint, dchar)(3, 'd')]

This means that imap gives to the mapping function a tuple that contains the
index and the item. ifilter works similarly (unfortunately D doesn't yet have a
syntax for tuple unpacking in function signatures).

This is nice and handy, but enumerate() can be used in more situations. On the
other hand imap is a little shorter:

"abcd".imap!(t => t).writeln;

"abcd".enumerate.imap!(t => t).writeln;

In the end I prefer enumerate.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 04 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=5550





 In the end I prefer enumerate.
"enumerate.map" is more composable than "imap". But iFilter is handy. See Issue 9841 -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 30 2013