digitalmars.D - Improving std.range.Zip

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= (16/16) Oct 24 2010 I have noticed an emerging idiom in my code lately: bring together n

Philippe Sigaud (18/30) Oct 24 2010 ed

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= (8/18) Oct 24 2010 It could be. As long as the job gets done it doesn't matter what will be...

bearophile (9/14) Oct 24 2010 But lot of time ago I have said that in my opinion that's not the best d...

Simen kjaeraas (7/24) Oct 24 2010 From what I can see, map currently simply doesn't support passing it

bearophile (4/7) Oct 24 2010 If you may have multiple of both then the situation becomes complex. So ...
Andrei Alexandrescu (5/35) Oct 24 2010 This is coming full circle. At a point, map _did_ support multiple

KennyTM~ (7/10) Oct 24 2010 Except that at "that point"[1], map's multi-range support is equivalent ...

Andrei Alexandrescu (3/13) Oct 24 2010 Oh, you're right. Sorry!
=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= (13/24) Oct 25 2010 )

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= (4/5) Oct 25 2010 reduce is of course not a range ;)
bearophile (40/49) Oct 25 2010 This is a general and interesting design topic. The short answer is that...

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= <just ask.me> writes:

I have noticed an emerging idiom in my code lately: bring together n  
ranges, transform them to one range using a n-ary function. Currently it's  
achieved with:

map!((a) {return myNaryFun(a._0, a._1, ...); })(zip(range1, range2, ...));

It's a bit of a nuisanse -- rarely do my transforming functions take  
tuples, and there's the necessity of composing 2 higher-order ranges to  
render fairly common functionality. I think Zip could be further  
parametrized with a zipper function, so that the above code would boil  
down to:

zip!myNaryFun(range1, range2, ...);

Having looked at Zip's source, such change shouldn't be a big deal. Oh,  
and the default zipper function would be std.typecons.tuple, not to  
trouble those not keen on generalization.

Good?

-- 
Tomek

Oct 24 2010

Philippe Sigaud <philippe.sigaud gmail.com> writes:

2010/10/24 Tomek Sowi=C5=84ski <just ask.me>:
 I have noticed an emerging idiom in my code lately: bring together n rang=

es,
 transform them to one range using a n-ary function. Currently it's achiev=

ed
 with:

 map!((a) {return myNaryFun(a._0, a._1, ...); })(zip(range1, range2, ...))=

;
 It's a bit of a nuisanse -- rarely do my transforming functions take tupl=

es,
 and there's the necessity of composing 2 higher-order ranges to render
 fairly common functionality. I think Zip could be further parametrized wi=

th
 a zipper function, so that the above code would boil down to:

 zip!myNaryFun(range1, range2, ...);

 Having looked at Zip's source, such change shouldn't be a big deal. Oh, a=

nd
 the default zipper function would be std.typecons.tuple, not to trouble
 those not keen on generalization.

That's what Haskell calls ZipWith. I called it tmap (as in tuple-map)
when I needed it in D.
IMHO, it should be a generalization of std.algorithm.map: let it
accept n ranges and a n-ary function. It can even do a partial check
at compile-time.
Extending filter() and reduce() to work with n ranges in parallel and
a n-args function is also useful.

you may be interested in looking there:

http://www.dsource.org/projects/dranges/

(more specifically, in the algorithm module)

Philippe

Oct 24 2010

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= <just ask.me> writes:

Dnia 24-10-2010 o 21:34:54 Philippe Sigaud <philippe.sigaud gmail.com>  =

napisa=B3(a):

 That's what Haskell calls ZipWith. I called it tmap (as in tuple-map)
 when I needed it in D.
 IMHO, it should be a generalization of std.algorithm.map: let it
 accept n ranges and a n-ary function. It can even do a partial check
 at compile-time.

It could be. As long as the job gets done it doesn't matter what will be=
  =

generalized, Map or Zip. One will cover other's functionality entirely.

 Extending filter() and reduce() to work with n ranges in parallel and
 a n-args function is also useful.
 you may be interested in looking there:
 http://www.dsource.org/projects/dranges/
 (more specifically, in the algorithm module)

Nice lib. Any plans to incorporate some of it to Phobos?

-- =

Tomek

Oct 24 2010

bearophile <bearophileHUGS lycos.com> writes:

Tomek S.:

 map!((a) {return myNaryFun(a._0, a._1, ...); })(zip(range1, range2, ...));

Currently the docs of std.algorithm.map say:

Multiple functions can be passed to map. In that case, the element type of map 
is a tuple containing one element for each function.<

But lot of time ago I have said that in my opinion that's not the best design.
Python has a different design, its map does what you want, you may write your
code in Python as:

map(myNaryFun, range1, range2, ...)

An example (Python 2.6):

 a = ["a", "b", "c", "d"]
 b = [1, 2, 3, 4]
 map(lambda c,n: c * n, a, b)



['a', 'bb', 'ccc', 'dddd']

Is is possible to change the std.algorithm.map to a semantics similar to the
Python one, that I think is more useful?

Bye,
bearophile

Oct 24 2010

"Simen kjaeraas" <simen.kjaras gmail.com> writes:

On Sun, 24 Oct 2010 21:39:24 +0200, bearophile <bearophileHUGS lycos.com>  
wrote:

 Tomek S.:

 map!((a) {return myNaryFun(a._0, a._1, ...); })(zip(range1, range2,  
 ...));

 Currently the docs of std.algorithm.map say:

 Multiple functions can be passed to map. In that case, the element type  
 of map  is a tuple containing one element for each function.<

 But lot of time ago I have said that in my opinion that's not the best  
 design. Python has a different design, its map does what you want, you  
 may write your code in Python as:

 map(myNaryFun, range1, range2, ...)

 An example (Python 2.6):

 a = ["a", "b", "c", "d"]
 b = [1, 2, 3, 4]
 map(lambda c,n: c * n, a, b)



 ['a', 'bb', 'ccc', 'dddd']

 Is is possible to change the std.algorithm.map to a semantics similar to  
 the Python one, that I think is more useful?

 From what I can see, map currently simply doesn't support passing it
multiple ranges. It would be a trivial change to let it support multiple
ranges in addition to multiple functions.


-- 
Simen

Oct 24 2010

bearophile <bearophileHUGS lycos.com> writes:

Simen kjaeraas:

  From what I can see, map currently simply doesn't support passing it
 multiple ranges. It would be a trivial change to let it support multiple
 ranges in addition to multiple functions.

If you may have multiple of both then the situation becomes complex. So I think
that single function - multiple ranges is simpler & better than that :-)

Bye,
bearophile

Oct 24 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/24/10 14:55 CDT, Simen kjaeraas wrote:
 On Sun, 24 Oct 2010 21:39:24 +0200, bearophile
 <bearophileHUGS lycos.com> wrote:

 Tomek S.:

 map!((a) {return myNaryFun(a._0, a._1, ...); })(zip(range1, range2,
 ...));

 Currently the docs of std.algorithm.map say:

 Multiple functions can be passed to map. In that case, the element
 type of map is a tuple containing one element for each function.<

 But lot of time ago I have said that in my opinion that's not the best
 design. Python has a different design, its map does what you want, you
 may write your code in Python as:

 map(myNaryFun, range1, range2, ...)

 An example (Python 2.6):

 a = ["a", "b", "c", "d"]
 b = [1, 2, 3, 4]
 map(lambda c,n: c * n, a, b)



 ['a', 'bb', 'ccc', 'dddd']

 Is is possible to change the std.algorithm.map to a semantics similar
 to the Python one, that I think is more useful?

  From what I can see, map currently simply doesn't support passing it
 multiple ranges. It would be a trivial change to let it support multiple
 ranges in addition to multiple functions.

This is coming full circle. At a point, map _did_ support multiple 
ranges. Some people found that non-modular - if you want multiple 
ranges, you should use map with zip...

Andrei

Oct 24 2010

KennyTM~ <kennytm gmail.com> writes:

On Oct 25, 10 14:04, Andrei Alexandrescu wrote:
 This is coming full circle. At a point, map _did_ support multiple
 ranges. Some people found that non-modular - if you want multiple
 ranges, you should use map with zip...

Except that at "that point"[1], map's multi-range support is equivalent to

     map!(func)(r1, r2, r3...) ==  map!(func)(chain(r1, r2, r3...))

which is hardly useful. OTOH zipWithN is a very useful higher-order 
function.

[1]: 
http://web.archive.org/web/20080417051734/www.digitalmars.com/d/2.0/phobos/std_algorithm.html#map

Oct 24 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 10/25/10 1:42 CDT, KennyTM~ wrote:
 On Oct 25, 10 14:04, Andrei Alexandrescu wrote:
 This is coming full circle. At a point, map _did_ support multiple
 ranges. Some people found that non-modular - if you want multiple
 ranges, you should use map with zip...

 Except that at "that point"[1], map's multi-range support is equivalent to

 map!(func)(r1, r2, r3...) == map!(func)(chain(r1, r2, r3...))

 which is hardly useful. OTOH zipWithN is a very useful higher-order
 function.

 [1]:
 http://web.archive.org/web/20080417051734/www.digitalmars.com/d/2.0/phobos/std_algorithm.html#map

Oh, you're right. Sorry!

Andrei

Oct 24 2010

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= <just ask.me> writes:

Dnia 25-10-2010 o 08:42:31 KennyTM~ <kennytm gmail.com> napisa=B3(a):

 On Oct 25, 10 14:04, Andrei Alexandrescu wrote:
 This is coming full circle. At a point, map _did_ support multiple
 ranges. Some people found that non-modular - if you want multiple
 ranges, you should use map with zip...

 Except that at "that point"[1], map's multi-range support is equivalen=

t  =

 to

      map!(func)(r1, r2, r3...) =3D=3D  map!(func)(chain(r1, r2, r3...)=

)
 which is hardly useful. OTOH zipWithN is a very useful higher-order  =

 function.

 [1]:  =

 http://web.archive.org/web/20080417051734/www.digitalmars.com/d/2.0/ph=

obos/std_algorithm.html#map

You're right. Still, modularity is a valid point. Thinking about it, thi=
s  =

can be a good exercise in static inheritance -- make Zip the base taking=
  =

care of multi-range traversal (stopping policy, etc) and the "derived"  =

ranges (Map, Filter, Reduce) would only implement element accessors,  =

transformation, and so on.

-- =

Tomek

Oct 25 2010

=?iso-8859-2?B?VG9tZWsgU293afFza2k=?= <just ask.me> writes:

Dnia 25-10-2010 o 21:20:24 Tomek Sowi=F1ski <just ask.me> napisa=B3(a):

 (Map, Filter, Reduce)

reduce is of course not a range ;)

-- =

Tomek

Oct 25 2010

bearophile <bearophileHUGS lycos.com> writes:

Tomek S.:

 You're right. Still, modularity is a valid point.

This is a general and interesting design topic. The short answer is that a real
language is designed taking into account different trade-offs.

Modularity is important, and Andrei surely likes it. Modularity means the
ability to combine small parts to create many different complex systems. To do
this the subsystems must interact clearly with each other, they must lack too
many corner cases, and of course they need a common flexible interface. As I
have recently read, in a modular system "the surprises will be pleasant ones".

On the other hand modularity has some disadvantages too. To be clean those
interfaces among subsystems must often be hi-level, this means they may be not
fully efficient (unless your compiler is very smart, as the most optimizing
compiler, the Stalin Scheme compiler). Another disadvantage is that you have to
build common higher order systems over and over again, and sometimes you need
some brain and time to build them well.

To avoid this last problem a well designed system usually offers you pre-built
higher order systems, made with the elementary components, that you may just
use. In some situations very common usage patterns deserve to be encapsulated
into handy forms.

So, if you have one of the most used high order functions (map), if its usage
often enough deals with mapping functions that work on more than one range of
items for its arguments, and if the current way to perform this operation from
elementary components produces a not so readable syntax, then practicality asks
for for a less modular but more handy design of map :-)

Few more examples of this trade-off below.

----------------

I have even (weakly) asked for amap()/afilter() functions, that are just
array(map(...)) and array(filter(...)), because those are very common patterns
in my code. I am still not sure adding amap/afilter to Phobos is a good idea,
it just reduces the number of parenthesis a little, so it decreases noise in
the code.

----------------

Recently in Bugzilla I have asked for sorted()/schwartzSorted(), in this case
my desire is stronger, because they save a bit more than just few chars, they
are expressions:
http://d.puremagic.com/issues/show_bug.cgi?id=5076

----------------

A similar thread has recently come up in the D.learn newsgroup:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.learn&article_id=22413

This person has asked for a way to remove the first item in an array given not
the item index (found with indexOf), but the item itself.

Python2 too has those two different ways to remove an item from a list (array),
by inded and by item:

 arr = ["a", "b", "c", "d"]
 arr.remove("x")



Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
 arr.index("c")



2
 arr.remove("c")
 arr



['a', 'b', 'd']
 del arr[1]
 arr



['a', 'd']


This composed operation (list.remove in Python) is useful in D both because
it's a common need, and because it saves the testing code in the middle
(between the index and the del, to be sure the item is present). Well, the test
is done anyway, but it's done after the single operation, this is a little
better.

In Python the list.remove raises an exception if the item is missing. In D this
function may throw an exception or return a false boolean (exceptions are
safer, but less efficient, this is another trade-off. In some situations you
may even add both functions, one that throws exceptions and one that returns
true/false).

In D a combined function like the Python list.remove() one also avoids possible
troubles caused by the fact that indexOf() returns a signed integer, and while
the delete function accepts signed integers too, then I think inside it
compares them to a unsigned length and this may cause troubles :-(

You may compile this wrong program in release mode, to see it:


import std.algorithm: remove, indexOf;
import std.stdio: writeln;
void main() {
    int[] a1 = [3, 5, 7, 8];
    auto pos = a1.indexOf(11);
    writeln(pos); // -1
    int[] a2 = a1.remove(pos);
    writeln(a2);
}


Partially related problem, this is part of the docs of std.algorithm:

The original array has remained of the same length because all functions in
std.algorithm only change content, not topology. The value 8 is repeated
because std.algorithm.move  was invoked to move elements around and on integers
move simply copies the source to the destination. To replace a with the effect
of the removal, simply assign a = remove(a, 1). The slice will be rebound to
the shorter array and the operation completes with maximal efficiency.<

This is bug-prone, but maybe there is no way to avoid this behaviour, because
in D arrays aren't true reference types.

Bye,
bearophile

Oct 25 2010

D Programming

C/C++ Programming

Other

digitalmars.D - Improving std.range.Zip