digitalmars.D - Array comprehensions and the like

• bearophile (98/102) Jul 18 2008 I have already discussed about this topic a couple of times in the past,...
• bearophile (20/20) Jul 18 2008 Here I have missed a small thing:
• downs (16/46) Jul 18 2008 auto a = lazyit /select/ ex!("x -> x > 5") /map/ ex!("x -> x*x");
• bearophile (11/41) Jul 19 2008 Sorry, that Python version of mine is "wrong", because the split splits ...
bearophile <bearophileHUGS lycos.com> writes:
```I have already discussed about this topic a couple of times in the past, but
this time I have more things to talk about :-)

I think Python-like array comprehensions and the like are quite useful, their
light syntax replaces the need for map, filter, xfilter, xmap, some usages of
flatten, xflatten, and some more things, while being quite readable still and
short.

Recently a DMDScript too has gained array comprehensions:
http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7#Array_comprehensions

Few examples in Python 3:

# eager list (array) comp, 1D:
a = [x * x for x in lazyit if x > 5]

# eager list comp, 2D:
parts = [[part.strip() for part in line.split()] for line in file("txt")]

# set comprehension
squares = {x*x for x in xrange(100)}

# dict comprehension
multi = {c:c*i for i,c in enumerate("abcd")}

# lazy, 1D
mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
flat = (abs(el) for row in mat for el in row if el % 2)
flat

<generator object at 0x00A8E1E8>
list(flat)

[3, 3, 9, 1, 7]

-----------------

This is how you can express them with normal D v.1 syntax:

int[] a;
foreach (x; lazyit)
if (x > 5)
a ~= x * x;

string[][] parts;
foreach (line; xfile("txt")) {
auto split_line = line.split();
auto line_parts = new string[split_line.length];
foreach (i, part; split_line)
line_parts[i] = part;
parts ~= line_parts;
}

uint[int] squares;
for (int i; i < 100; i++)
squares[i * i] = 0;

string[char] multi;
foreach (i, c; "abcdef") {
string c_i = new char[i];
c_i[] = c;
multi[c] = c_i;
}

struct Process(TyItem) {
TyItem[][] mat;
int opApply(int delegate(ref TyItem) dg) {
int result;
foreach (row; this.mat)
foreach (el; row)
if (el % 2) {
TyItem aux = abs(el);
result = dg(aux);
if (result) break;
}
return result;
}
}
...
auto flat = Process!(typeof(mat[0][0]))(mat);

-----------------

This is how they can be expressed in D with my functional libs, note that
x-something are lazy (if he wants, downs may show how to write them with his
tools):

auto a = filter((int x){return x > 5;}, xmap((int x){return x * x;}, lazyit));

auto parts = map((string l){return map(strip, l);}, xfile("txt"));

Or shorter (and sometimes more readable):

int x;
auto parts = select(x * x, lazyit, x > 5);

string line;
auto parts = select(map(strip, l), line, xfile("txt"));

auto squares = set( xmap((int x){return x*x;}, xrange(100)) );

auto multi = dict(map2((int i, char c){return toStruct(c, mulItem(c, i));},
"abcdef"));

Or:
string s = "abcdef";
auto multi = dict(s, map2((int i, char c){return mulItem(c, i);}, s));

I've tried this too, to swap the arguments of mulItem(), but std.bind is too
much fragile for most purposes of mine (or I am not able to use it properly):
auto multi = dict(map2(&bind(&mulItem!(char), _1, _0), "abcdef"));

auto flat = xflatten(xmap((int[] r){return xmapFilter(&abs, r, (int x){return
x%2;});}, mat));

The xflatten() is a quite complex lazy functional, it flattens iterables of any
kind with any depth, but the tree of iterables must be complete. With static
typing I think you need variants to allow a not complete tree.

With a splitting it becomes a bit more readable, but it's hairy still:
auto aux = (int[] row){return xmapFilter(&abs, row, (int el){return el % 2;});};
auto flat2 = xflatten(xmap(aux, mat));

As you can see the D versions are syntax-heavy (especially the last ones),
quite less easy to read, you need good memory or a good IDE to remember the
names of the various higher order functions, there are various alternative ways
to write them, and often you leave an extra bracket or you add one too much,
etc. So much, that in real code I'm not going to use some of those hairy lines
(while the original python lines can be acceptable still).
This is a good example that shows that if you want to use a functional-style of
coding, the syntax of your language must often be short and good.

-----------------

Possible D 2.x syntax (first try at a possible syntax, they may be improved):

auto a = [x * x foreach(x; lazyit) if (x > 5)];
auto parts = [[part.strip() foreach(part; line.split())] foreach(line;
xfile("txt"))];

With this syntax you can't use auto:
void[int] squares = [x*x foreach(x; 0..100)];

Alternative that allows the use of auto too, the void denotes a set:
auto squares = void[x*x foreach(x; 0..100)];

That is related to the syntax for built-in sets I have suggested recently:
void[int] s; // set s
s ~= 56;
s & s2 // intersection, etc.

Here auto is enough because the comma denotes an AA:
auto multi = [c:mulIter(c, i) foreach(i,c; "abcdef")];

Lazy:
auto flat = (abs(el) foreach(row; mat) foreach(el; row) if (el % 2));

-----------------

Partially related for Walter: recently you have asked in this newsgroup for
articles regarding a functional style of D. In the last days I have posted one
such blog post, that's quite long and contains timings, explanations, links,
etc. It compares various implementations of a small but real program in Python,
Python-looking D, normal D, C-style D, C, ML, and more.

Bye,
bearophile
```
Jul 18 2008
bearophile <bearophileHUGS lycos.com> writes:
```Here I have missed a small thing:

string[][] parts;
foreach (line; xfile("txt")) {
auto split_line = line.split();
auto line_parts = new string[split_line.length];
foreach (i, part; split_line)
line_parts[i] = part;
parts ~= line_parts;
}

More complete:

string[][] parts;
foreach (line; xfile("txt")) {
auto split_line = line.split();
auto line_parts = new string[split_line.length];
foreach (i, part; split_line)
line_parts[i] = part.strip();
parts ~= line_parts;
}

Bye,
bearophile
```
Jul 18 2008
downs <default_357-line yahoo.de> writes:
```For what it's worth, here's the tools versions:

bearophile wrote:
I have already discussed about this topic a couple of times in the past, but
this time I have more things to talk about :-)

I think Python-like array comprehensions and the like are quite useful, their
light syntax replaces the need for map, filter, xfilter, xmap, some usages of
flatten, xflatten, and some more things, while being quite readable still and
short.

Recently a DMDScript too has gained array comprehensions:
http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7#Array_comprehensions

Few examples in Python 3:

# eager list (array) comp, 1D:
a = [x * x for x in lazyit if x > 5]

auto a = lazyit /select/ ex!("x -> x > 5") /map/ ex!("x -> x*x");

# eager list comp, 2D:
parts = [[part.strip() for part in line.split()] for line in file("txt")]

auto parts = (cast(string) "txt".read()).split("\n") /map/ (string s) { return
s.split(" ").strip(); };

# set comprehension
squares = {x*x for x in xrange(100)}

auto squares = Range[100] /map/ ex!("x -> x*x");

# dict comprehension
multi = {c:c*i for i,c in enumerate("abcd")}

Can't do that yet. Sad.
# lazy, 1D
mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
flat = (abs(el) for row in mat for el in row if el % 2)
flat

T[] iterToField(T)(void delegate(void delegate(T)) dg) {
T[] res; dg((T t) { res ~= t; });
return res;
}

auto mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]];
auto flat = (void delegate(ref int) dg) {
foreach (ref row; mat) foreach (ref el; row) if (el % 2) dg(el);
};

<generator object at 0x00A8E1E8>
list(flat)

[3, 3, 9, 1, 7]

writefln(iterToField(flat));

--downs
```
Jul 18 2008
bearophile <bearophileHUGS lycos.com> writes:
```downs:

For what it's worth, here's the tools versions:

If you show here how to use your tools people may learn (how) to use them a bit
more.

# eager list comp, 2D:
parts = [[part.strip() for part in line.split()] for line in file("txt")]

auto parts = (cast(string) "txt".read()).split("\n") /map/ (string s) { return
s.split(" ").strip(); };

Sorry, that Python version of mine is "wrong", because the split splits
according to the white space, so a successive stripping is useless, so a more
meaningful code may be:
parts = [[part.strip() for part in line.split(",")] for line in file("txt")]

The Python version scans the lines lazily, so it works on mega files too. Your
version is eager according to the lines, I presume. In my code I have used
xfile() that's lazy.
But even if you change your code like this and ignore the lazy thing:

auto parts = (cast(string) "txt".read()).split("\n") /map/ (string s) { return
s.split(",").strip(); };

I think it does something different, because you may need two /map/ to emulate
the two for of the Python version.

# set comprehension
squares = {x*x for x in xrange(100)}

auto squares = Range[100] /map/ ex!("x -> x*x");

That Python version creates a set, while isn't your squares an array?

# lazy, 1D
mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
flat = (abs(el) for row in mat for el in row if el % 2)
flat

T[] iterToField(T)(void delegate(void delegate(T)) dg) {
T[] res; dg((T t) { res ~= t; });
return res;
}

auto mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]];
auto flat = (void delegate(ref int) dg) {
foreach (ref row; mat) foreach (ref el; row) if (el % 2) dg(el);
};

<generator object at 0x00A8E1E8>
list(flat)

[3, 3, 9, 1, 7]

writefln(iterToField(flat));

I don't see the abs(), but that's not much important.

Bye,
bearophile
```
Jul 19 2008