www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Array comprehensions and the like

reply bearophile <bearophileHUGS lycos.com> writes:
I have already discussed about this topic a couple of times in the past, but
this time I have more things to talk about :-)

I think Python-like array comprehensions and the like are quite useful, their
light syntax replaces the need for map, filter, xfilter, xmap, some usages of
flatten, xflatten, and some more things, while being quite readable still and
short.

Recently a DMDScript too has gained array comprehensions:
http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7#Array_comprehensions


Few examples in Python 3:

# eager list (array) comp, 1D:
a = [x * x for x in lazyit if x > 5]

# eager list comp, 2D:
parts = [[part.strip() for part in line.split()] for line in file("txt")]

# set comprehension
squares = {x*x for x in xrange(100)}

# dict comprehension
multi = {c:c*i for i,c in enumerate("abcd")}

# lazy, 1D
 mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
 flat = (abs(el) for row in mat for el in row if el % 2)
 flat



 list(flat)



----------------- This is how you can express them with normal D v.1 syntax: int[] a; foreach (x; lazyit) if (x > 5) a ~= x * x; string[][] parts; foreach (line; xfile("txt")) { auto split_line = line.split(); auto line_parts = new string[split_line.length]; foreach (i, part; split_line) line_parts[i] = part; parts ~= line_parts; } uint[int] squares; for (int i; i < 100; i++) squares[i * i] = 0; string[char] multi; foreach (i, c; "abcdef") { string c_i = new char[i]; c_i[] = c; multi[c] = c_i; } struct Process(TyItem) { TyItem[][] mat; int opApply(int delegate(ref TyItem) dg) { int result; foreach (row; this.mat) foreach (el; row) if (el % 2) { TyItem aux = abs(el); result = dg(aux); if (result) break; } return result; } } ... auto flat = Process!(typeof(mat[0][0]))(mat); ----------------- This is how they can be expressed in D with my functional libs, note that x-something are lazy (if he wants, downs may show how to write them with his tools): auto a = filter((int x){return x > 5;}, xmap((int x){return x * x;}, lazyit)); auto parts = map((string l){return map(strip, l);}, xfile("txt")); Or shorter (and sometimes more readable): int x; auto parts = select(x * x, lazyit, x > 5); string line; auto parts = select(map(strip, l), line, xfile("txt")); auto squares = set( xmap((int x){return x*x;}, xrange(100)) ); auto multi = dict(map2((int i, char c){return toStruct(c, mulItem(c, i));}, "abcdef")); Or: string s = "abcdef"; auto multi = dict(s, map2((int i, char c){return mulItem(c, i);}, s)); I've tried this too, to swap the arguments of mulItem(), but std.bind is too much fragile for most purposes of mine (or I am not able to use it properly): auto multi = dict(map2(&bind(&mulItem!(char), _1, _0), "abcdef")); auto flat = xflatten(xmap((int[] r){return xmapFilter(&abs, r, (int x){return x%2;});}, mat)); The xflatten() is a quite complex lazy functional, it flattens iterables of any kind with any depth, but the tree of iterables must be complete. With static typing I think you need variants to allow a not complete tree. With a splitting it becomes a bit more readable, but it's hairy still: auto aux = (int[] row){return xmapFilter(&abs, row, (int el){return el % 2;});}; auto flat2 = xflatten(xmap(aux, mat)); As you can see the D versions are syntax-heavy (especially the last ones), quite less easy to read, you need good memory or a good IDE to remember the names of the various higher order functions, there are various alternative ways to write them, and often you leave an extra bracket or you add one too much, etc. So much, that in real code I'm not going to use some of those hairy lines (while the original python lines can be acceptable still). This is a good example that shows that if you want to use a functional-style of coding, the syntax of your language must often be short and good. ----------------- Possible D 2.x syntax (first try at a possible syntax, they may be improved): auto a = [x * x foreach(x; lazyit) if (x > 5)]; auto parts = [[part.strip() foreach(part; line.split())] foreach(line; xfile("txt"))]; With this syntax you can't use auto: void[int] squares = [x*x foreach(x; 0..100)]; Alternative that allows the use of auto too, the void denotes a set: auto squares = void[x*x foreach(x; 0..100)]; That is related to the syntax for built-in sets I have suggested recently: void[int] s; // set s s ~= 56; s & s2 // intersection, etc. Here auto is enough because the comma denotes an AA: auto multi = [c:mulIter(c, i) foreach(i,c; "abcdef")]; Lazy: auto flat = (abs(el) foreach(row; mat) foreach(el; row) if (el % 2)); ----------------- Partially related for Walter: recently you have asked in this newsgroup for articles regarding a functional style of D. In the last days I have posted one such blog post, that's quite long and contains timings, explanations, links, etc. It compares various implementations of a small but real program in Python, Python-looking D, normal D, C-style D, C, ML, and more. Bye, bearophile
Jul 18 2008
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Here I have missed a small thing:

string[][] parts;
foreach (line; xfile("txt")) {
    auto split_line = line.split();
    auto line_parts = new string[split_line.length];
    foreach (i, part; split_line)
        line_parts[i] = part;
    parts ~= line_parts;
}

More complete:

string[][] parts;
foreach (line; xfile("txt")) {
    auto split_line = line.split();
    auto line_parts = new string[split_line.length];
    foreach (i, part; split_line)
        line_parts[i] = part.strip();
    parts ~= line_parts;
}

Bye,
bearophile
Jul 18 2008
prev sibling parent reply downs <default_357-line yahoo.de> writes:
For what it's worth, here's the tools versions:

bearophile wrote:
 I have already discussed about this topic a couple of times in the past, but
this time I have more things to talk about :-)
 
 I think Python-like array comprehensions and the like are quite useful, their
light syntax replaces the need for map, filter, xfilter, xmap, some usages of
flatten, xflatten, and some more things, while being quite readable still and
short.
 
 Recently a DMDScript too has gained array comprehensions:
 http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7#Array_comprehensions
 
 
 Few examples in Python 3:
 
 # eager list (array) comp, 1D:
 a = [x * x for x in lazyit if x > 5]

 
 # eager list comp, 2D:
 parts = [[part.strip() for part in line.split()] for line in file("txt")]
 

 # set comprehension
 squares = {x*x for x in xrange(100)}
 

 # dict comprehension
 multi = {c:c*i for i,c in enumerate("abcd")}
 

 # lazy, 1D
 mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
 flat = (abs(el) for row in mat for el in row if el % 2)
 flat




T[] iterToField(T)(void delegate(void delegate(T)) dg) { T[] res; dg((T t) { res ~= t; }); return res; } auto mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]; auto flat = (void delegate(ref int) dg) { foreach (ref row; mat) foreach (ref el; row) if (el % 2) dg(el); };
 <generator object at 0x00A8E1E8>
 list(flat)




--downs
Jul 18 2008
parent bearophile <bearophileHUGS lycos.com> writes:
downs:

 For what it's worth, here's the tools versions:

If you show here how to use your tools people may learn (how) to use them a bit more.
 # eager list comp, 2D:
 parts = [[part.strip() for part in line.split()] for line in file("txt")]
 


Sorry, that Python version of mine is "wrong", because the split splits according to the white space, so a successive stripping is useless, so a more meaningful code may be: parts = [[part.strip() for part in line.split(",")] for line in file("txt")] The Python version scans the lines lazily, so it works on mega files too. Your version is eager according to the lines, I presume. In my code I have used xfile() that's lazy. But even if you change your code like this and ignore the lazy thing:
 auto parts = (cast(string) "txt".read()).split("\n") /map/ (string s) { return
s.split(",").strip(); };

I think it does something different, because you may need two /map/ to emulate the two for of the Python version.
 # set comprehension
 squares = {x*x for x in xrange(100)}
 


That Python version creates a set, while isn't your squares an array?
 # lazy, 1D
 mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]
 flat = (abs(el) for row in mat for el in row if el % 2)
 flat




T[] iterToField(T)(void delegate(void delegate(T)) dg) { T[] res; dg((T t) { res ~= t; }); return res; } auto mat = [[0, 3, 0, 3], [9, 8, 1, 4], [6, 7, 6, 6]]; auto flat = (void delegate(ref int) dg) { foreach (ref row; mat) foreach (ref el; row) if (el % 2) dg(el); };
 <generator object at 0x00A8E1E8>
 list(flat)





I don't see the abs(), but that's not much important. Bye, bearophile
Jul 19 2008