digitalmars.D - A Small Contribution to Phobos

"Meta" <jared771 gmail.com> writes:

I saw a thread a few days ago about somebody wanting a few 
UFCS-based convenience functions, so I thought that I'd take the 
opportunity to make a small contribution to Phobos. Currently I 
have four small functions: each, exhaust, perform, and tap, and 
would like some feedback.

each is designed to perform operations with side-effects on each 
range element. To actually change the elements of the range, each 
element must be accepted by reference.

Range each(alias fun, Range)(Range r)
if (isInputRange!(Unqual!Range))
{
     alias unaryFun!fun _fun;
     foreach (ref e; r)
     {
         fun(e);
     }

     return r;
}

//Prints [-1, 0, 1]
[1, 2, 3].each!((ref i) => i -= 2).writeln;


exhaust iterates a range until it is exhausted. It also has the 
nice feature that if range.front is callable, exhaust will call 
it upon each iteration.

Range exhaust(Range)(Range r)
if (isInputRange!(Unqual!Range))
{

     while (!r.empty)
     {
         r.front();
         r.popFront();
     }

     return r;
}

//Writes "www.dlang.org". x is an empty MapResult range.
auto x = "www.dlang.org"
          .map!((c) { c.write; return false; })
          .exhaust;

//Prints []
[1, 2, 3].exhaust.writeln;


perform is pretty badly named, but I couldn't come up with a 
better one. It can be inserted in a UFCS chain and perform some 
operation with side-effects. It doesn't alter its argument, just 
returns it for the next function in the chain.

T perform(alias dg, T)(ref T val)
{
     dg();

     return val;
}

//Prints "Mapped: 2 4"
[1, 2, 3, 4, 5]
.filter!(n => n < 3)
.map!(n => n * n)
.perform!({write("Mapped: ");})
.each!(n => write(n, " "));


Lastly is tap, which takes a value and performs some mutating 
operation on it. It then returns the value.

T tap(alias dg, T)(auto ref T val)
{
     dg(val);

     return val;
}

class Foo
{
     int x;
     int y;
}

auto f = (new Foo).tap!((f)
{
     f.x = 2;
     f.y = 3;
});

//Prints 2 3
writeln(f.x, " ", f.y);

struct Foo2
{
     int x;
     int y;
}

//Need to use ref for value types
auto f2 = Foo2().tap!((ref f)
{
     f.x = 3;
     f.y = 2;
});

//Prints 3 2
writeln(f2.x, " ", f2.y);


Do you think these small functions have a place in Phobos? I 
think each and exhaust would be best put into std.range, but I'm 
not quite sure where perform and tap should go. Also, there's 
that horrible name for perform, for which I would like to come up 
with a better name.

Jun 01 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, June 02, 2013 04:57:53 Meta wrote:
 I saw a thread a few days ago about somebody wanting a few
 UFCS-based convenience functions, so I thought that I'd take the
 opportunity to make a small contribution to Phobos. Currently I
 have four small functions: each, exhaust, perform, and tap, and
 would like some feedback.
 
 each is designed to perform operations with side-effects on each
 range element. To actually change the elements of the range, each
 element must be accepted by reference.
 
 Range each(alias fun, Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {
      alias unaryFun!fun _fun;
      foreach (ref e; r)
      {
          fun(e);
      }
 
      return r;
 }
 
 //Prints [-1, 0, 1]
 [1, 2, 3].each!((ref i) => i -= 2).writeln;

For reference type ranges and input ranges which are not forward ranges, this 
will consume the range and return nothing. It would have to accept only 
forward ranges and save the result before iterating over it. Also, range-based 
functions should not be strict (i.e. not lazy) without good reason. And I 
don't see much reason to make this strict. Also, it's almost the same thing as 
map. Why not just use map? The predicate can simply return the same value 
after it's operated on it.

If we did add this, I'd argue that transform is a better name, but I'm still 
inclined to think that it's not worth adding.

 exhaust iterates a range until it is exhausted. It also has the
 nice feature that if range.front is callable, exhaust will call
 it upon each iteration.
 
 Range exhaust(Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {
 
      while (!r.empty)
      {
          r.front();
          r.popFront();
      }
 
      return r;
 }
 
 //Writes "www.dlang.org". x is an empty MapResult range.
 auto x = "www.dlang.org"
           .map!((c) { c.write; return false; })
           .exhaust;
 
 //Prints []
 [1, 2, 3].exhaust.writeln;

The callable bit won't work. It'll just call front. You'd have to do something 
like

static if(isCallable!(ElementType!R))
    r.front()();

Also, if front were pure, then calling it and doing nothing with its return 
value would result in a compilation error. The same goes if the element type 
is a pure callable. And even if this did work exactly as you intended, I think 
that assuming that someone exhausting the range would want what front returns 
to be called is a bad idea. Maybe they do, maybe they don't; I'd expect that 
in most cases, they wouldn't. If that's what they want, they can call map 
before calling exhaust.
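
The static if suggested above could be fleshed out roughly as follows. This is only a sketch, assuming a current Phobos (std.range.primitives and std.traits); it fetches the element into a local (the name fn is an assumption) before invoking it, and it takes no side on whether exhaust should invoke callable elements at all. The unittest block runs when compiled with -unittest.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.traits : isCallable, Unqual;

// Sketch only: walks the range to its end, touching front at every step.
Range exhaust(Range)(Range r)
if (isInputRange!(Unqual!Range))
{
    while (!r.empty)
    {
        static if (isCallable!(ElementType!Range))
        {
            auto fn = r.front;   // fetch the element...
            fn();                // ...then invoke it
        }
        else
        {
            cast(void) r.front;  // evaluate front and explicitly discard the value
        }
        r.popFront();
    }
    return r;
}

unittest
{
    int calls;
    void delegate()[] actions = [() { ++calls; }, () { ++calls; }];
    auto rest = actions.exhaust;
    assert(rest.empty && calls == 2);
}
```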

 perform is pretty badly named, but I couldn't come up with a
 better one. It can be inserted in a UFCS chain and perform some
 operation with side-effects. It doesn't alter its argument, just
 returns it for the next function in the chain.
 
 T perform(alias dg, T)(ref T val)
 {
      dg();
 
      return val;
 }
 
 //Prints "Mapped: 2 4"
 [1, 2, 3, 4, 5]
 .filter!(n => n < 3)
 .map!(n => n * n)
 .perform!({write("Mapped: ");})
 .each!(n => write(n, " "));

So, you want a function which is passed something (including a range) 
and which then returns that same value after calling some other function? Does this 
really buy you much over just splitting up the expression? You're already 
giving a multiline example anyway.

auto foo = [1, 2, 3, 4, 5].filter!(n => n < 3)().map!(n => n * n)();
write("Mapped: ");
foo.each!(n => write(n, " "))();

And I think that this is a perfect example of something that should just be 
done with foreach anyway. Not to mention, if you're calling very many 
functions, you're going to need to use multiple lines, in which case chaining 
the functions like that doesn't buy you much. All you end up doing is taking 
what would normally be a sequence of statements and turning it into one 
multiline statement. I don't think that this buys us much, especially when 
it's just calling one function which does nothing on any object in the chain.

 Lastly is tap, which takes a value and performs some mutating
 operation on it. It then returns the value.
 
 T tap(alias dg, T)(auto ref T val)
 {
      dg(val);
 
      return val;
 }
 
 class Foo
 {
      int x;
      int y;
 }
 
 auto f = (new Foo).tap!((f)
 {
      f.x = 2;
      f.y = 3;
 });
 
 //Prints 2 3
 writeln(f.x, " ", f.y);
 
 struct Foo2
 {
      int x;
      int y;
 }
 
 //Need to use ref for value types
 auto f2 = Foo2().tap!((ref f)
 {
      f.x = 3;
      f.y = 2;
 });
 
 //Prints 3 2
 writeln(f2.x, " ", f2.y);

Why do you need tap? So that you can use an anonymous function? If it had a 
name, you'd just use it with UFCS. I'd argue that this use case is minimal 
enough that you might as well just give it a name and then use UFCS if you 
really want to use UFCS, and if you want an anonymous function, what's the 
real gain of chaining it with UFCS anyway? It makes the expression much harder 
to read if you try and chain calls on the anonymous function.

UFCS' main purpose is making it so that a function can be called on multiple 
types in the same manner (particularly where it could be a member function in 
some cases and a free function in others), and it just so happens to make 
function chaining cleaner in some cases. But there's no reason to try and turn 
all function calls into UFCS calls, and I think that perform and tap are taking 
it too far.

- Jonathan M Davis

Jun 01 2013

"Brad Anderson" <eco gnuk.net> writes:

On Sunday, 2 June 2013 at 04:10:15 UTC, Jonathan M Davis wrote:
 On Sunday, June 02, 2013 04:57:53 Meta wrote:
 The callable bit won't work. It'll just call front. You'd have 
 to do something
 like

 static if(isCallable!(ElementType!R))
     r.front()();

 Also, if front were pure, then calling it and doing nothing 
 with its return
 value would result in a compilation error. The same goes if the 
 element type
 is a pure callable.

Calling front is kind of the point of exhaust(), otherwise you'd 
use takeNone().  You wouldn't use this if front were pure because 
the only reason you'd want exhaust is if you were (ab)using side 
effects (like I was the other day on D.learn).  Having it error 
out if you were using it on a range with pure front() is actually 
a good thing because you've made some error in your reasoning if 
you think you want exhaust() to run in that situation. 
processSideEffects() is probably too long a name.

  And even if this did work exactly as you intended, I think
 that assuming that someone exhausting the range would want 
 what front returns
 to be called is a bad idea. Maybe they do, maybe they don't, 
 I'd expect that
 in most cases, they wouldn't. If that's what they want, they 
 can call map
 before calling exhaust.

Sticking a map before exhaust without it calling front() would 
accomplish nothing. I know this because my own little toy eat() 
just called popFront() originally on a Map range and nothing 
happened.  You'd be skipping map's function if you don't call 
front.
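
A small sketch of that behaviour, assuming a current Phobos map; the counter calls and the names r and popped are only there for illustration. Popping a map range never runs the mapped function; evaluating front does.

```d
import std.algorithm.iteration : map;
import std.range.primitives : empty, front, popFront;

unittest
{
    int calls;
    auto r = [1, 2, 3].map!((n) { ++calls; return n; });

    // Popping alone never evaluates the mapped function.
    auto popped = r;
    while (!popped.empty)
        popped.popFront();
    assert(calls == 0);

    // Touching front at each step is what actually runs it.
    while (!r.empty)
    {
        cast(void) r.front;
        r.popFront();
    }
    assert(calls == 3);
}
```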

Jun 01 2013

"Meta" <jared771 gmail.com> writes:

 For reference type ranges and input ranges which are not 
 forward ranges, this
 will consume the range and return nothing.

I originally wrote it to accept forward ranges and use save, but 
I wanted to make it as inclusive as possible. I guess I 
overlooked the case of ref ranges. As for ranges that aren't 
forward ranges, consider a simple input range.

struct InputRange
{
     int[] arr = [1, 2, 3, 4, 5];

     int front() { return arr.front; }

     bool empty() { return arr.empty; }

     void popFront() { return arr.popFront; }
}

writeln(isForwardRange!InputRange); //False

InputRange()
.each!(n => write(n, " "))
.map!(n => n * n)
.writeln;

This outputs 1 2 3 4 5 [1, 4, 9, 16, 25], so each is not 
returning an empty range. I believe this is because r in this 
case is a value type range, and the foreach loop makes a copy of 
it. This does still leave the problem of reference type ranges.
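
The difference between the two cases can be seen in a short sketch. ValueCountdown and RefCountdown are hypothetical ranges made up for this illustration: the same countdown written once as a struct and once as a class.

```d
// Hypothetical ranges for illustration only.
struct ValueCountdown
{
    int n;
    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

final class RefCountdown
{
    int n;
    this(int n) { this.n = n; }
    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

unittest
{
    auto v = ValueCountdown(3);
    foreach (e; v) {}      // foreach iterates over a copy of the struct
    assert(v.n == 3);      // the original is untouched

    auto c = new RefCountdown(3);
    foreach (e; c) {}      // only the reference is copied
    assert(c.empty);       // the one underlying object has been consumed
}
```

So a function like each that loops with foreach and then returns its parameter hands back a fresh struct range, but an exhausted class range.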

 Also, range-based
 functions should not be strict (i.e. not lazy) without good 
 reason. And I
 don't see much reason to make this strict.

It's not lazy because it's intended to perform some mutating or 
otherwise side-effectful operation. Map doesn't play well with 
side effects, partially because of its laziness. A very contrived 
example:

auto arr = [1, 2, 3, 4].map!(n => n.writeln); //Now what?

It's not clear now what to do with the result. You could try a 
foreach loop:

foreach (n; arr) n(); //Error: n cannot be of type void

But that doesn't work. A solution would be to modify the function 
you pass to map:

auto arr = [1, 2, 3, 4].map!((n) { n.writeln; return n; });

foreach (n; arr) {} //Prints 1 2 3 4

But that's both ugly and verbose. each also has the advantage of 
being able to return the original range (possibly modified), 
whereas map must return a MapResult due to its laziness, and you 
need that extra array call to bludgeon it into the correct form. 
each is also more efficient in that it doesn't need to return a 
copy of the data passed to it. It simply mutates it in-place.

 Also, it's almost the same thing as map. Why not just use map? 
 The predicate can simply return the same value
 after it's operated on it.

See above. There are some cases where map is clunky to work with 
due to it being non-strict.

 If we did add this, I'd argue that transform is a better name, 
 but I'm still
 inclined to think that it's not worth adding.

I chose the name each because it's a common idiom in a couple of 
other languages (Javascript, Ruby and Rust off the top of my 
head), and because I think it underlines the fact that each is 
meant to perform side-effectful operations.

 exhaust iterates a range until it is exhausted. It also has the
 nice feature that if range.front is callable, exhaust will call
 it upon each iteration.
 
 Range exhaust(Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {
 
      while (!r.empty)
      {
          r.front();
          r.popFront();
      }
 
      return r;
 }
 
 //Writes "www.dlang.org". x is an empty MapResult range.
 auto x = "www.dlang.org"
           .map!((c) { c.write; return false; })
           .exhaust;
 
 //Prints []
 [1, 2, 3].exhaust.writeln;

 The callable bit won't work. It'll just call front. You'd have 
 to do something
 like

 static if(isCallable!(ElementType!R))
     r.front()();

I was having some trouble with writing exhaust and forgot all 
about ElementType. I'll change that.

 Also, if front were pure, then calling it and doing nothing 
 with its return
 value would result in a compilation error. The same goes if the 
 element type
 is a pure callable.

Is this true for all pure functions? That seems like kind of 
strange behaviour to me, and doesn't really make sense given the 
definition of functional purity.

 And even if this did work exactly as you intended, I think
 that assuming that someone exhausting the range would want 
 what front returns
 to be called is a bad idea. Maybe they do, maybe they don't, 
 I'd expect that
 in most cases, they wouldn't. If that's what they want, they 
 can call map
 before calling exhaust.

I think the original reason that somebody wanted exhaust was 
because map is lazy and they wanted a function which could take 
the result of map and consume it while calling front each time. 
Otherwise, there wouldn't be much reason to have this, as there 
is takeNone and popFrontN.

 So, you want to have a function which you pass something 
 (including a range)
 and then returns that same value after calling some other 
 function? Does this
 really buy you much over just splitting up the expression - 
 you're already
 giving a multiline example anyway.

It gives you the advantage of not having to split your UFCS chain 
up, which I personally find valuable, and I think other people 
would as well. I think it's quite similar to the various 
side-effectful monads in Haskell, which don't do anything with 
their argument other than return it, but perform some operation 
with side-effects in the process. I'll try to think up a better 
example for this, because I think it can be quite useful in 
certain circumstances.

 And I think that this is a perfect example of something that 
 should just be
 done with foreach anyway. Not to mention, if you're calling 
 very many
 functions, you're going to need to use multiple lines, in which 
 case chaining
 the functions like that doesn't buy you much. All you end up 
 doing is taking
 what would normally be a sequence of statements and turning it 
 into one
 multiline statement. I don't think that this buys us much, 
 especially when
 it's just calling one function which does nothing on any object 
 in the chain.

See above. I think there is quite a high value in not having to 
define extra variables and split up your UFCS chain halfway 
through, which is somewhat obfuscatory, I think.

 Why do you need tap? So that you can use an anonymous function? 
 If it had a
 name, you'd just use it with UFCS. I'd argue that this use case 
 is minimal
 enough that you might as well just give it a name and then use 
 UFCS if you
 really want to use UFCS, and if you want an anonymous function, 
 what's the
 real gain of chaining it with UFCS anyway? It makes the 
 expression much harder
 to read if you try and chain calls on the anonymous function.

One way in which tap can be useful is that you can perform some 
operations on data in the middle of a UFCS chain and then go 
about your business. It's probably the least useful of the four.

 UFCS' main purpose is making it so that a function can be 
 called on multiple
 types in the same manner (particularly where it could be a 
 member function in
 some cases and a free function in others), and it just so 
 happens to make
 function chaining cleaner in some cases. But there's no reason 
 to try and turn
 all function calls into UFCS calls, and I think that perform and 
 tap are taking
 it too far.

Personally, I prefer function-chaining style more, as I think 
it's more aesthetic and more amenable to Walter's notion of 
component programming. For someone who doesn't use UFCS that 
much, these functions will seem almost useless, as their entire 
functionality can be duplicated by using some other construct. I 
wrote them to allow more versatile UFCS chains, so you don't have 
to break them up.

I think that people who heavily use UFCS, on the other hand, will 
find these quite useful in different situations.

Jun 01 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/2/13 1:58 AM, Meta wrote:
 For reference type ranges and input ranges which are not forward
 ranges, this
 will consume the range and return nothing.

 I originally wrote it to accept forward ranges and use save, but I
 wanted to make it as inclusive as possible. I guess I overlooked the
 case of ref ranges.

[snip]

Thanks for sharing your ideas.

I think consuming all of a range evaluating front and doing nothing 
should be the role of reduce with only one parameter (the range). That 
overload would take the range to be "exhausted" and return void. Thus 
your example becomes:

[1, 2, 3, 4].map!(n => n.writeln).reduce;


Andrei

Jun 02 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 [1, 2, 3, 4].map!(n => n.writeln).reduce;

I have to shoot this down for many reasons:

I think it's better to give that final function a different name 
(like "consume" or something like that) because it's used for 
very different purposes and it returns nothing. Re-using the name 
"reduce" doesn't reduce the amount of Phobos lines of code, it 
doesn't make the user code simpler to understand, it's more 
obscure because it's more semantically overloaded, and it's no 
easier for future D users to find in the documentation. 
Function names are not language keywords; packing different 
purposes into one name, as with "static", gives no advantages, 
only disadvantages.

And using map with a lambda that returns nothing is not a style I 
like :-( It's probably better to encourage D programmers to give 
pure lambdas to map/filter, for several reasons (safety, 
cleanness, code style, future D front-end optimizations done on 
those higher order functions, to allow a better debuggability, 
and to avoid Phobos bugs like 
http://d.puremagic.com/issues/show_bug.cgi?id=9674 ). So I think 
it's better to introduce a new Phobos function like tap() that 
accepts a function/delegate with side effects that takes no input 
arguments.

Bye,
bearophile

Jun 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/2/13 9:20 AM, bearophile wrote:
 Andrei Alexandrescu:

 [1, 2, 3, 4].map!(n => n.writeln).reduce;

 I have to shoot this down for many reasons:

 I think it's better to give that final function a different name (like
 "consume" or something like that) because it's used for very different
 purposes and it returns nothing. Re-using the name "reduce" doesn't
 reduce the amount of Phobos lines of code, it doesn't make the user code
 simpler to understand, it's more obscure because it's more semantically
 overloaded, and it's no easier for future D users to find in the
 documentation.

Au contraire, there are many advantages. Using "reduce" leverages a 
well-understood notion instead of introducing a new one. There is less 
need for documentation, motivation, and explanations. "Reduce with no 
function simply spans the entire range." Builds on an already-eager 
construct par excellence instead of adding a new one that must be 
remembered and distinguished from the lazy constructs.

Actually my first thought when I saw consume() was to look up reduce, 
thinking, "how do I reduce a range to nothing"? Because that's the goal. 
Reduce is the obvious choice here.

 Function names are not language keywords, packing different purposes
 in the same name as "static" doesn't give any advantage, and only
 disadvantages.

Strawman argument.


Andrei

Jun 02 2013

"Meta" <jared771 gmail.com> writes:

 I think consuming all of a range evaluating front and doing 
 nothing should be the role of reduce with only one parameter 
 (the range). That overload would take the range to be 
 "exhausted" and return void. Thus your example becomes:

Maybe, then, it would be best to have a template that calls 
reduce in such a way, that makes it perfectly clear what is 
happening.

Jun 02 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote:
 On 6/2/13 1:58 AM, Meta wrote:
 For reference type ranges and input ranges which are not 
 forward
 ranges, this
 will consume the range and return nothing.

 I originally wrote it to accept forward ranges and use save, 
 but I
 wanted to make it as inclusive as possible. I guess I 
 overlooked the
 case of ref ranges.

 [snip]

 Thanks for sharing your ideas.

 I think consuming all of a range evaluating front and doing 
 nothing should be the role of reduce with only one parameter 
 (the range). That overload would take the range to be 
 "exhausted" and return void. Thus your example becomes:

 [1, 2, 3, 4].map!(n => n.writeln).reduce;


 Andrei

One of the problems with using "map" for something such as this, 
is that the resulting object is not a range, since "front" now 
returns void, and a range *must* return a value. So that code 
will never compile (since reduce will ask for at least input 
range). Heck, I think we should make it so that map refuses to 
compile with an operator that returns void. It doesn't make much 
sense as-is.

Usage has to be something like:

map!((n) {n.writeln; return n;})

which is quite clunky. The idea of a "tee" range, that takes n, 
runs an operation on it, and then returns said n as is becomes 
really very useful (and more idiomatic). [1, 2, 3, 4].tee!(n => 
n.writeln). There! perfect :)

I've dabbled in implementing such a function, but there are 
conceptual problems: If the user calls "front" twice in a row, 
then should "fun" be called twice? If user popsFront without 
calling front, should "fun" be called at all?

Should it keep track of calls, to guarantee 1, and only 1, call 
on each element?

I'm not sure there is a correct answer to that, which is one of 
the reasons I haven't actually submitted anything.

--------

I don't think "argument-less reduce" should do what you describe, 
as it would be a bit confusing what the function does. 1-names; 
1-operation, IMO. Users might accidentally think they are getting 
an additive reduction :(

I think a function called "walk", in line with "walkLength", 
would be much more appropriate, and make more sense to boot!

But we run into the same problem... Should "walk" call front 
between each element? Both answers are correct, IMO.

Jun 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/2/13 11:41 AM, monarch_dodra wrote:
 On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote:
 [1, 2, 3, 4].map!(n => n.writeln).reduce;


 Andrei

 One of the problems with using "map" for something such as this, is that
 the resulting object is not a range, since "front" now returns void, and
 a range *must* return a value. So that code will never compile (since
 reduce will ask for at least input range). Heck, I think we should make
 it so that map refuses to compile with an operator that returns void. It
 doesn't make much sense as-is.

Hm, interesting. I'm destroyed.

 Usage has to be something like:

 map!((n) {n.writeln; return n;})

 which is quite clunky. The idea of a "tee" range, that takes n, runs an
 operation on it, and then returns said n as is becomes really very
 useful (and more idiomatic). [1, 2, 3, 4].tee!(n => n.writeln). There!
 perfect :)

 I've dabbled in implementing such a function, but there are conceptual
 problems: If the user calls "front" twice in a row, then should "fun" be
 called twice? If user popsFront without calling front, should "fun" be
 called at all?

 Should it keep track of calls, to guarantee 1, and only 1, call on each
 element?

 I'm not sure there is a correct answer to that, which is one of the
 reasons I haven't actually submitted anything.

I think there is one answer that arguably narrows the design space 
appropriately: just like the Unix utility, tee should provide a hook 
that creates an exact replica of the (portion of the) range being 
iterated. So calling front several times is nicely out of the picture. 
The remaining tactical options are:

1. evaluate .front for the parent range once in its constructor and then 
every time right after forwarding popFront() to the parent range. This 
is a bit "eager" because the constructor evaluates .front even if the 
client never does.

2. evaluate .front for the parent range just before forwarding 
popFront() to parent. This will call front even though the client 
doesn't (which I think is fine).

3. keep a bool that is set by constructor and popFront() and reset by 
front(). The bool makes sure front() is called if and only if the client 
calls it.

I started writing the options mechanically without thinking of the 
implications. Now that I'm done, I think 2 is by far the best.
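
Option 2 could look roughly like the sketch below. TeeOnPop and teeOnPop are hypothetical names chosen for this illustration, not an existing Phobos API, and the sketch ignores forward-range support, ref fronts and other details a real implementation would need.

```d
import std.range.primitives : empty, front, isInputRange, popFront;

// Hypothetical sketch of option 2: the hook runs just before each popFront.
struct TeeOnPop(alias fun, R)
if (isInputRange!R)
{
    private R source;

    this(R source) { this.source = source; }

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    void popFront()
    {
        fun(source.front);   // the hook sees each element just before it is dropped
        source.popFront();
    }
}

auto teeOnPop(alias fun, R)(R source)
{
    return TeeOnPop!(fun, R)(source);
}

unittest
{
    int[] seen;
    auto r = [1, 2, 3].teeOnPop!((int n) { seen ~= n; });

    // Reading front repeatedly does not trigger the hook.
    assert(r.front == 1 && r.front == 1);
    assert(seen.length == 0);

    // Popping without ever reading front still does.
    while (!r.empty)
        r.popFront();
    assert(seen == [1, 2, 3]);
}
```

With this arrangement the hook fires exactly once per element however often the client reads front, at the cost of never seeing an element the client stops short of popping.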

 --------

 I don't think "argument-less reduce" should do what you describe, as it
 would be a bit confusing what the function does. 1-names; 1-operation,
 IMO. Users might accidentally think they are getting an additive
 reduction :(

Good point.

 I think a function called "walk", in line with "walkLength", would be
 much more appropriate, and make more sense to boot!

 But we run into the same problem... Should "walk" call front between
 each element? Both answers are correct, IMO.

That's why I'm thinking: the moment .front gets evaluated, we get into 
the realm of reduce.


Andrei

Jun 02 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 2 June 2013 at 16:55:23 UTC, Andrei Alexandrescu wrote:
 On 6/2/13 11:41 AM, monarch_dodra wrote:
 On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu 
 wrote:
 [1, 2, 3, 4].map!(n => n.writeln).reduce;


 Andrei

 One of the problems with using "map" for something such as 
 this, is that
 the resulting object is not a range, since "front" now returns 
 void, and
 a range *must* return a value. So that code will never compile 
 (since
 reduce will ask for at least input range). Heck, I think we 
 should make
 it so that map refuses to compile with an operator that 
 returns void. It
 doesn't make much sense as-is.

 Hm, interesting. I'm destroyed.

 Usage has to be something like:

 map!((n) {n.writeln; return n;})

 which is quite clunky. The idea of a "tee" range, that takes 
 n, runs an
 operation on it, and then returns said n as is becomes really 
 very
 useful (and more idiomatic). [1, 2, 3, 4].tee!(n => 
 n.writeln). There!
 perfect :)

 I've dabbled in implementing such a function, but there are 
 conceptual
 problems: If the user calls "front" twice in a row, then 
 should "fun" be
 called twice? If user popsFront without calling front, should 
 "fun" be
 called at all?

 Should it keep track of calls, to guarantee 1, and only 1, 
 call on each
 element?

 I'm not sure there is a correct answer to that, which is one 
 of the
 reasons I haven't actually submitted anything.

 I think there is one answer that arguably narrows the design 
 space appropriately: just like the Unix utility, tee should 
 provide a hook that creates an exact replica of the (portion of 
 the) range being iterated. So calling front several times is 
 nicely out of the picture. The remaining tactical options are:

 1. evaluate .front for the parent range once in its constructor 
 and then every time right after forwarding popFront() to the 
 parent range. This is a bit "eager" because the constructor 
 evaluates .front even if the client never does.

 2. evaluate .front for the parent range just before forwarding 
 popFront() to parent. This will call front even though the 
 client doesn't (which I think is fine).

 3. keep a bool that is set by constructor and popFront() and 
 reset by front(). The bool makes sure front() is called if and 
 only if the client calls it.

 I started writing the options mechanically without thinking of 
 the implications. Now that I'm done, I think 2 is by far the 
 best.

I think I just had a good idea. First, we introduce "cached": 
cached will take the result of front, but only evaluate it once. 
This is a good idea in and of itself, and should take the 
place of ".array()" in UFCS chains. It can store the result of an 
operation, but keeps the lazy iteration semantic. That's a win 
for functional programming right there.

It would be most convenient right after an expensive call, such 
as after a map or whatnot.

The semantic of "cached" would be:
"eagerly calls front once, always once, and exactly once, and 
stores the result. Calling front on cached returns said result. 
calling popFront repeats operation".

From there, "tee" is nothing more than "calls fun on the front 
element every time front is called, then returns front".

From there, users can use either of:

MyRange.tee!foo(): This calls foo on every front element, and 
several times if front gets called several times.
MyRange.tee!foo().cached(): This calls foo on every front 
element, but only once, and guaranteed at least once, if it gets 
iterated.

 --------

 I don't think "argument-less reduce" should do what you 
 describe, as it
 would be a bit confusing what the function does. 1-names; 
 1-operation,
 IMO. Users might accidentally think they are getting an 
 additive
 reduction :(

 Good point.

 I think a function called "walk", in line with "walkLength", 
 would be
 much more appropriate, and make more sense to boot!

 But we run into the same problem... Should "walk" call front 
 between
 each element? Both answers are correct, IMO.

 That's why I'm thinking: the moment .front gets evaluated, we 
 get into the realm of reduce.

Combined with my "cached" proposal, the problem is solved, I 
think: "walk" does not call front, it merely pops. But, if 
combined with cached, then cached *will* call front. Once and 
exactly once.

This will call foo on all elements of my range (once, and exactly 
once):

MyRange.tee!foo().cached().walk();

--------

Unless I'm missing something, it looks like a sweet spot between 
functionality, modularity, and even efficiency...?
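
A minimal sketch of the "cached" and "walk" semantics described above. The names Cached, cached and walk are hypothetical and follow the wording of this post; this is not the code of any Phobos function, and it assumes the element type is assignable.

```d
import std.range.primitives;

// Hypothetical sketch of the "cached" adaptor; not the eventual Phobos code.
struct Cached(R)
if (isInputRange!R)
{
    private R source;
    private ElementType!R value;   // stored result of source.front
    private bool done;

    this(R r)
    {
        source = r;
        fetch();                   // eager: evaluate the first front right away
    }

    private void fetch()
    {
        done = source.empty;
        if (!done)
            value = source.front;  // the only place source.front is evaluated
    }

    @property bool empty() { return done; }
    @property ElementType!R front() { return value; }

    void popFront()
    {
        source.popFront();
        fetch();                   // "calling popFront repeats the operation"
    }
}

auto cached(R)(R r) { return Cached!R(r); }

// Hypothetical "walk": it only pops and never calls front itself.
void walk(R)(R r)
if (isInputRange!R)
{
    while (!r.empty)
        r.popFront();
}

unittest
{
    import std.algorithm.iteration : map;

    static int calls;
    calls = 0;
    auto r = [1, 2, 3].map!((int n) { ++calls; return n * 10; }).cached;
    assert(calls == 1);            // evaluated eagerly by the constructor
    assert(r.front == 10 && r.front == 10);
    assert(calls == 1);            // repeated front does not re-run the lambda
    r.walk();
    assert(calls == 3);            // once, and exactly once, per element
}
```

Storing the value in the wrapper is what lets walk stay ignorant of front: every evaluation happens inside the constructor or popFront.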

Jun 02 2013

"Diggory" <diggsey googlemail.com> writes:

On Sunday, 2 June 2013 at 18:43:44 UTC, monarch_dodra wrote:
 On Sunday, 2 June 2013 at 16:55:23 UTC, Andrei Alexandrescu 
 wrote:
 On 6/2/13 11:41 AM, monarch_dodra wrote:
 On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu 
 wrote:
 [1, 2, 3, 4].map!(n => n.writeln).reduce;


 Andrei

 One of the problems with using "map" for something such as 
 this, is that
 the resulting object is not a range, since "front" now 
 returns void, and
 a range *must* return a value. So that code will never 
 compile (since
 reduce will ask for at least input range). Heck, I think we 
 should make
 it so that map refuses to compile with an operator that 
 returns void. It
 doesn't make much sense as-is.

 Hm, interesting. I'm destroyed.

 Usage has to be something like:

 map!((n) {n.writeln; return n;})

 which is quite clunky. The idea of a "tee" range, that takes 
 n, runs an
 operation on it, and then returns said n as is becomes really 
 very
 useful (and more idiomatic). [1, 2, 3, 4].tee!(n => 
 n.writeln). There!
 perfect :)

 I've dabbled in implementing such a function, but there are 
 conceptual
 problems: If the user calls "front" twice in a row, then 
 should "fun" be
 called twice? If user popsFront without calling front, should 
 "fun" be
 called at all?

 Should it keep track of calls, to guarantee 1, and only 1, 
 call on each
 element?

 I'm not sure there is a correct answer to that, which is one 
 of the
 reasons I haven't actually submitted anything.

 I think there is one answer that arguably narrows the design 
 space appropriately: just like the Unix utility, tee should 
 provide a hook that creates an exact replica of the (portion 
 of the) range being iterated. So calling front several times 
 is nicely out of the picture. The remaining tactical options 
 are:

 1. evaluate .front for the parent range once in its 
 constructor and then every time right after forwarding 
 popFront() to the parent range. This is a bit "eager" because 
 the constructor evaluates .front even if the client never does.

 2. evaluate .front for the parent range just before forwarding 
 popFront() to parent. This will call front even though the 
 client doesn't (which I think is fine).

 3. keep a bool that is set by constructor and popFront() and 
 reset by front(). The bool makes sure front() is called if and 
 only if the client calls it.

 I started writing the options mechanically without thinking of 
 the implications. Now that I'm done, I think 2 is by far the 
 best.

 I think I just had a good idea. First, we introduce "cached": 
 cached will take the result of front, but only evaluate it 
 once. This is a good idea in and of itself, and should take 
 the place of ".array()" in UFCS chains. It can store the result 
 of an operation, but keeps the lazy iteration semantic. That's 
 a win for functional programming right there.

 It would be most convenient right after an expensive call, such 
 as after a map or whatnot.

 The semantic of "cached" would be:
 "eagerly calls front once, always once, and exactly once, and 
 stores the result. Calling front on cached returns said result. 
 calling popFront repeats operation".

 From there, "tee" is nothing more than "calls fun on the 
 front element every time front is called, then returns front".

 From there, users can use either of:

 MyRange.tee!foo(): This calls foo on every front element, and 
 several times if front gets called several times.
 MyRange.tee!foo().cached(): This calls foo on every front 
 element, but only once, and guaranteed at least once, if it 
 gets iterated.

 --------

 I don't think "argument-less reduce" should do what you 
 describe, as it
 would be a bit confusing what the function does. 1-names; 
 1-operation,
 IMO. Users might accidentally think they are getting an 
 additive
 reduction :(

 Good point.

 I think a function called "walk", in line with "walkLength", 
 would be
 much more appropriate, and make more sense to boot!

 But we run into the same problem... Should "walk" call front 
 between
 each element? Both answers are correct, IMO.

 That's why I'm thinking: the moment .front gets evaluated, we 
 get into the realm of reduce.

 Combined with my "cached" proposal, the problem is solved I 
 think: "walk" does not call front, it merely pops. But, if 
 combined with cached, then cached *will* call front. Once and 
 exactly once.

 This will call foo on all elements of my range (once, and exactly 
 once):

 MyRange.tee!foo().cached().walk();

 --------

 Unless I'm missing something, it looks like a sweet spot 
 between functionality, modularity, and even efficiency...?

I like the idea of "cached" and it's certainly useful if you need 
to iterate a range multiple times or something like that, but I 
also think that 90% of the time the user is just going to want to 
do something simple such as printing every element, and I think 
the syntax "tee!(x => writeln(x)).cached.walk();" is both 
unnecessarily long and less efficient than simply:
consume!(x => writeln(x)); // Template parameter is optional

"consume" would always call front once per element even if no 
function is specified.
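
A rough sketch of what such a "consume" could look like. The function is hypothetical (it is not in Phobos), and the do-nothing default lambda is just one assumed way to make the template parameter optional.

```d
import std.range.primitives;

// Hypothetical "consume": not a Phobos function, just the shape proposed above.
// The default for fun is a do-nothing lambda, so the template argument is optional.
void consume(alias fun = (x) {}, R)(R r)
if (isInputRange!R)
{
    for (; !r.empty; r.popFront())
        fun(r.front);              // front is evaluated exactly once per element
}

unittest
{
    import std.algorithm.iteration : map;

    int sum;
    [1, 2, 3].consume!((x) { sum += x; });
    assert(sum == 6);

    static int calls;
    calls = 0;
    [1, 2, 3].map!((int n) { ++calls; return n; }).consume;
    assert(calls == 3);            // front still runs once per element
}
```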

Jun 02 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 6/2/13 2:43 PM, monarch_dodra wrote:
 I think I just had a good idea. First, we introduce "cached": cached
 will take the result of front, but only evaluate it once. This is a good
 idea in and of itself, and should take the place of ".array()" in
 UFCS chains.

Yah, cached() (better cache()?) should be nice. It may also offer 
lookahead, e.g. cache(5) would offer a non-standard lookahead(size_t n) 
up to 5 elements ahead.

  From there, "tee" is nothing more than "calls fun on the front
 element every time front is called, then returns front".

  From there, users can use either of:

 MyRange.tee!foo(): This calls foo on every front element, and several
 times if front gets called several times.
 MyRange.tee!foo().cached(): This calls foo on every front element, but
 only once, and guaranteed at least once, if it gets iterated.

I kinda dislike that tee() is hardly useful without cache.


Andrei

Jun 02 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Monday, 3 June 2013 at 02:31:00 UTC, Andrei Alexandrescu wrote:
 On 6/2/13 2:43 PM, monarch_dodra wrote:
 I think I just had a good idea. First, we introduce "cached": 
 cached
 will take the result of front, but only evaluate it once. This 
 is a good
 idea in and of itself, and should take the place of 
 ".array()" in
 UFCS chains.

 Yah, cached() (better cache()?) should be nice. It may also 
 offer lookahead, e.g. cache(5) would offer a non-standard 
 lookahead(size_t n) up to 5 elements ahead.

Hum... That'd be a whole different ballpark in terms of power, as 
opposed to the simple-minded cached I had in mind.

But I think both can coexist anyway, so I see no problem with 
adding extra functionality.

 From there, "tee" is nothing more than "calls fun on the 
 front
 element every time front is called, then returns front".

 From there, users can use either of:

 MyRange.tee!foo(): This calls foo on every front element, and 
 several
 times if front gets called several times.
 MyRange.tee!foo().cached(): This calls foo on every front 
 element, but
 only once, and guaranteed at least once, if it gets iterated.

 I kinda dislike that tee() is hardly useful without cache.


 Andrei

I disagree. One thing a user could expect out of tee is to print 
on every access, just to see "which elements get pushed down the 
pipe, and in which order", as opposed to "just print my range". 
In particular, I don't see why tee would not mix with random 
access.

For example, with this program:

     auto r = [4, 3, 2, 1].tee!writeln();
     writeln("first sort (not sorted)");
     r.sort();
     writeln("second sort (already sorted)");
     r.sort();

I can see the output as:

first sort (not sorted)
2
1
1
2
1
3
1
1
3
2
2
1
2
4
1
1
4
2
2
1
3
3
2
3
2
1
3
2
4
3
second sort (already sorted)
3
4
3
2
3
2
1
2
1
2
1
3
2
4
3

which gives me a good idea of how costly the sort algorithm is.

It's a good way to find out if cache(d) or array should be 
inserted in my chain.

Jun 02 2013

"irritate" <irritate gmail.com> writes:

On Monday, 3 June 2013 at 06:58:00 UTC, monarch_dodra wrote:
 I disagree. One thing a user could expect out of tee is to

Hello.

I have created a pull request for tee: 
https://github.com/D-Programming-Language/phobos/pull/1348

(I did this based on discussion in Issue 9882, and was not aware 
of this present forum thread until monarch_dodra asked me to 
present my approach here).

My approach is to create an InputRange wrapper called TeeRange, 
which will call the user provided function with each element of 
the wrapped range during iteration.

One concept with this is that the user can pass a flag to specify 
whether the function should be called on popFront (default) or on 
front.  The following unittest from my change illustrates that 
distinction:

---
unittest
{
     // Manually stride to test different pipe behavior.
     void testRange(Range)(Range r)
     {
         const int strideLen = 3;
         int i = 0;
         typeof(Range.front) elem;
         while (!r.empty)
         {
             if (i % strideLen == 0)
             {
                 elem = r.front();
             }
             r.popFront();
             i++;
         }
     }

     string txt = "abcdefghijklmnopqrstuvwxyz";

     int popCount = 0;
     auto pipeOnPop = tee!(a => popCount++)(txt);
     testRange(pipeOnPop);
     assert(popCount == 26);

     int frontCount = 0;
     auto pipeOnFront = tee!(a => frontCount++, false)(txt);
     testRange(pipeOnFront);
     assert(frontCount == 9);
}
---

Thanks,
irritate

Jun 16 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 16 June 2013 at 13:39:35 UTC, irritate wrote:
 [SNIP]

 One concept with this is that the user can pass a flag to 
 specify whether the function should be called on popFront 
 (default) or on front.

 [SNIP]

 Thanks,
 irritate

What made you change the parameter of :
* "pipeOnPop = false" (eg call on front by default)
to
* "pipeOnFront = false" (eg call on pop by default)
?

I think pipe on front makes more sense, since you'll actually 
*see* the last value that was passed if the stream is terminated, 
eg:
[1, 2, 3, 4].tee!`writeln("processing: ", a)`.until!"a > 2"();

Which will output:
processing: 1
processing: 2
<end>
what about 3?

Or

processing something A...
processing something B...
core dump...
(stream was actually processing C, but we are fooled into 
investigating B...)

The *advantage* of pipeOnPop is that each element is piped at 
least once, and at most once, so that's good. However, it comes 
with pitfalls which (IMO) I think should be an explicit opt-in.
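
The difference can be sketched with a small wrapper. Everything here is hypothetical: TeeSketch, teeSketch and the plain bool flag are illustrative names, not the API of the pull request, and the manual loop stands in for until!"a > 2".

```d
import std.range.primitives;

// Hypothetical sketch; the names and the bool flag are illustrative only.
struct TeeSketch(alias fun, bool pipeOnPop, R)
if (isInputRange!R)
{
    R source;

    @property bool empty() { return source.empty; }

    @property auto front()
    {
        static if (!pipeOnPop)
            fun(source.front);     // fires on every access, even repeated ones
        return source.front;
    }

    void popFront()
    {
        static if (pipeOnPop)
            fun(source.front);     // fires exactly once, as the element leaves
        source.popFront();
    }
}

auto teeSketch(alias fun, bool pipeOnPop = true, R)(R r)
{
    return TeeSketch!(fun, pipeOnPop, R)(r);
}

unittest
{
    static int[] seen;

    // Stop as soon as an element exceeds 2, as until!"a > 2" would.
    static void drain(R)(R r)
    {
        while (!r.empty && r.front <= 2)
            r.popFront();
    }

    seen = null;
    drain([1, 2, 3, 4].teeSketch!((a) { seen ~= a; }));
    assert(seen == [1, 2]);        // 3 ended the stream but was never piped

    seen = null;
    drain([1, 2, 3, 4].teeSketch!((a) { seen ~= a; }, false));
    assert(seen == [1, 2, 3]);     // 3 is visible because front was evaluated
}
```

With pipe-on-pop, the element that terminates the stream is never popped and so is never reported; pipe-on-front reports it, at the cost of repeated calls when front is read more than once.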

Jun 16 2013

"irritate" <irritate gmail.com> writes:

On Sunday, 16 June 2013 at 17:37:32 UTC, monarch_dodra wrote:
 What made you change the parameter of :
 * "pipeOnPop = false" (eg call on front by default)
 to
 * "pipeOnFront = false" (eg call on pop by default)
 ?

Actually the original version was pipeOnPop = true by default.  
So I didn't change the logic, I just renamed the variable to make 
the flag more clear that you would pass in pipeOnFront.yes to 
opt-in. (Also to coincide with your comment on the pull request).

 I think pipe on front makes more sense, since you'll actually 
 *see* the last value that was passed if the stream is 
 terminated, eg:
 [1, 2, 3, 4].tee!`writeln("processing: ", a)`.until!"a > 2"();

 Which will output:
 processing: 1
 processing: 2
 <end>
 what about 3?

. . .
 The *advantage* of pipeOnPop is that each element is piped at 
 least once, and at most once, so that's good.

I think they both have their advantages, which is why it's 
probably important to be able to control the behavior regardless 
of which one is default.  I chose pipeOnPop as the default 
because:

1)  It is more closely tied to the idea of tapping into the data 
as the wrapped range is iterated over (i.e. calling front 
multiple times won't call the function multiple times, as you 
said).
2)  I felt like I would personally use pipeOnPop more often, and 
figured the most commonly used case should not require the flag.

But I'm not especially tied to it, and could see making 
pipeOnFront default if that is preferred.

And actually, as I think about what I just wrote and also my 
previous unittest example, it almost feels like pipeOnPop gives 
you insight into the wrapped range itself, and pipeOnFront gives 
you more insight into how the range is used.  The incoming vs. 
the outgoing, as it were.  And I imagine I'd like to know more 
about what is coming in more of the time, but that's just my 
opinion.

irritate

Jun 16 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, June 17, 2013 03:41:45 irritate wrote:
 I think they both have their advantages, which is why it's
 probably important to be able to control the behavior regardless
 of which one is default.  I chose pipeOnPop as the default
 because:
 
 1)  It is more closely tied to the idea of tapping into the data
 as the wrapped range is iterated over (i.e. calling front
 multiple times won't call the function multiple times, as you
 said).
 2)  I felt like I would personally use pipeOnPop more often, and
 figured the most commonly used case should not require the flag.
 
 But I'm not especially tied to it, and could see making
 pipeOnFront default if that is preferred.
 
 And actually, as I think about what I just wrote and also my
 previous unittest example, it almost feels like pipeOnPop gives
 you insight into the wrapped range itself, and pipeOnFront gives
 you more insight into how the range is used.  The incoming vs.
 the outgoing, as it were.  And I imagine I'd like to know more
 about what is coming in more of the time, but that's just my
 opinion.

I would think that pipeOnPop would be better by default simply because it's 
the more efficient thing to do, especially if it's not clear which the 
programmer is more likely to want in the general case.

- Jonathan M Davis

Jun 16 2013

"monarch_dodra" <monarchdodra gmail.com> writes:

I have implemented and submitted "cache".

Please review/destroy :)

https://github.com/D-Programming-Language/phobos/pull/1364

Jun 22 2013

"deadalnix" <deadalnix gmail.com> writes:

On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote:
 On 6/2/13 1:58 AM, Meta wrote:
 For reference type ranges and input ranges which are not 
 forward
 ranges, this
 will consume the range and return nothing.

 I originally wrote it to accept forward ranges and use save, 
 but I
 wanted to make it as inclusive as possible. I guess I 
 overlooked the
 case of ref ranges.

 [snip]

 Thanks for sharing your ideas.

 I think consuming all of a range evaluating front and doing 
 nothing should be the role of reduce with only one parameter 
 (the range). That overload would take the range to be 
 "exhausted" and return void. Thus your example becomes:

 [1, 2, 3, 4].map!(n => n.writeln).reduce;


 Andrei

map being lazy, this can really do all kinds of different stuff.

Jun 02 2013

"Brad Anderson" <eco gnuk.net> writes:

On Sunday, 2 June 2013 at 02:57:56 UTC, Meta wrote:
 I saw a thread a few days ago about somebody wanting a few 
 UFCS-based convenience functions, so I thought that I'd take 
 the opportunity to make a small contribution to Phobos. 
 Currently I have four small functions: each, exhaust, perform, 
 and tap, and would like some feedback.

 each is designed to perform operations with side-effects on 
 each range element. To actually change the elements of the 
 range, each element must be accepted by reference.

 Range each(alias fun, Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {
     alias unaryFun!fun _fun;
     foreach (ref e; r)
     {
         _fun(e);
     }

     return r;
 }

 //Prints [-1, 0, 1]
 [1, 2, 3].each!((ref i) => i -= 2).writeln;


 exhaust iterates a range until it is exhausted. It also has the 
 nice feature that if range.front is callable, exhaust will call 
 it upon each iteration.

 Range exhaust(Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {

     while (!r.empty)
     {
         r.front();
         r.popFront();
     }

     return r;
 }

 //Writes "www.dlang.org". x is an empty MapResult range.
 auto x = "www.dlang.org"
          .map!((c) { c.write; return false; })
          .exhaust;

 //Prints []
 [1, 2, 3].exhaust.writeln;


 perform is pretty badly named, but I couldn't come up with a 
 better one. It can be inserted in a UFCS chain and perform some 
 operation with side-effects. It doesn't alter its argument, 
 just returns it for the next function in the chain.

 T perform(alias dg, T)(auto ref T val)
 {
     dg();

     return val;
 }

 //Prints "Mapped: 1 4"
 [1, 2, 3, 4, 5]
 .filter!(n => n < 3)
 .map!(n => n * n)
 .perform!({write("Mapped: ");})
 .each!(n => write(n, " "));


 Lastly is tap, which takes a value and performs some mutating 
 operation on it. It then returns the value.

 T tap(alias dg, T)(auto ref T val)
 {
     dg(val);

     return val;
 }

 class Foo
 {
     int x;
     int y;
 }

 auto f = (new Foo).tap!((f)
 {
     f.x = 2;
     f.y = 3;
 });

 //Prints 2 3
 writeln(f.x, " ", f.y);

 struct Foo2
 {
     int x;
     int y;
 }

 //Need to use ref for value types
 auto f2 = Foo2().tap!((ref f)
 {
     f.x = 3;
     f.y = 2;
 });

 //Prints 3 2
 writeln(f2.x, " ", f2.y);


 Do you think these small functions have a place in Phobos? I 
 think each and exhaust would be best put into std.range, but 
 I'm not quite sure where perform and tap should go. Also, 
 there's that horrible name for perform, for which I would like 
 to come up with a better name.

You may find this forum discussion from several months ago 
interesting.

http://forum.dlang.org/post/kglo9d$rjf$1 digitalmars.com

Confusingly, your each() seems to be fairly similar to what 
Andrei wanted tap() used for.  Andrei didn't care for the tap() 
you propose but loved the idea of a tap() function that works 
like unix tee.

I like exhaust() as I just had to write something similar.  I 
like perform() just because I love UFCS range chains and anything 
to avoid those extra statements is alright in my book.  This is 
probably not a majority opinion though.  I can't think of a 
better name either though.

Jun 01 2013

"bearophile" <bearophileHUGS lycos.com> writes:

Meta:

 perform is pretty badly named, but I couldn't come up with a 
 better one. It can be inserted in a UFCS chain and perform some 
 operation with side-effects. It doesn't alter its argument, 
 just returns it for the next function in the chain.

 T perform(alias dg, T)(auto ref T val)
 {
     dg();

     return val;
 }

 //Prints "Mapped: 1 4"
 [1, 2, 3, 4, 5]
 .filter!(n => n < 3)
 .map!(n => n * n)
 .perform!({write("Mapped: ");})
 .each!(n => write(n, " "));

I'd like something like this in Phobos, but I'd like it to have a 
better name. But in most (all?) cases what I want to put inside 
such perform is a printing function, so I have opened this:

http://d.puremagic.com/issues/show_bug.cgi?id=9882



 exhaust iterates a range until it is exhausted. It also has the 
 nice feature that if range.front is callable, exhaust will call 
 it upon each iteration.

 Range exhaust(Range)(Range r)
 if (isInputRange!(Unqual!Range))
 {

     while (!r.empty)
     {
         r.front();
         r.popFront();
     }

     return r;
 }

 //Writes "www.dlang.org". x is an empty MapResult range.
 auto x = "www.dlang.org"
          .map!((c) { c.write; return false; })
          .exhaust;

 //Prints []
 [1, 2, 3].exhaust.writeln;

I'd also like this in Phobos, for debugging purposes. But I'd 
like it to return nothing, so you are forced to use it only at 
the end of a chain.

(So I appreciate 2 of your 4 proposals. I proposed both of 
them in D.learn some time ago.)

---------------------

Brad Anderson:

 Andrei didn't care for the tap() you propose but loved the idea 
 of a tap() function that works like unix tee.

Something like this Python itertool?

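def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                newval = next(it)       # fetch a new value and
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)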


Bye,
bearophile

Jun 02 2013
