
digitalmars.D - Filter and Map's weird return types cause frustration...

reply "Allen Nelson" <ithinkican gmail.com> writes:
I'm trying to do some functional programming techniques on lists 
in D, and while I'm happy that there's some functionality to be 
found, some of (what I perceive as) the language's foibles don't 
make it as smooth as I'd like. Here's an example:

import std.stdio, std.algorithm;

void main() {
	string[][] example = [["a", "b", "sdflk"],
					   ["sdflkjsd", "sdl"],
					   ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];
	writeln(example);
	int i=2;
	bool isLongEnough (string[] slist) {return slist.length > i;}
	example = filter!(isLongEnough)(example);
	writeln(example);
}

It seems a perfectly valid desire, to have an array and want to 
prune it down into an array that satisfies certain requirements. 
After all, this is exactly what filter is supposed to do. This 
code seems fairly straightforward then, but trying to compile it:

dmd test.d
test.d(10): Error: cannot implicitly convert expression (filter(reps)) of type FilterResult!(isLongEnough, string[][]) to string[][]

So essentially, the compiler is telling me that even though 
filter's whole raison d'etre is to take a data structure and 
return the same kind of data structure, but with some values 
possibly removed, it not only returns a weird in-house-brand 
"FilterResult" type, it can't even convert this into a 
string[][] type. Now, I could use "auto" instead of 
"string[][]", and it works, but I have no use for a 
"FilterResult". I don't know what methods I can call on it, or 
what it's made out of. I want a string[][].

A similar story happens with the map function. Map takes an 
array and a function and returns another array which is of 
values computed from applying the function to the input array. 
One would think, then, that this code would work:

void main() {
	string[][] reps = [["a", "b", "sdflk"],
					   ["sdflkjsd", "sdl"],
					   ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];
	writeln(reps);
	int[] lengths = map!("a.length")(reps);
	writeln(lengths);
}

After all, the .length function returns an int, and we have an 
array of them, so map returns an int[], right? Wrong!

dmd test.d
test.d(8): Error: cannot implicitly convert expression (map(reps)) of type MapResult!(unaryFun, string[][]) to int[]

Once again, even though the returned array is, for all intents 
and purposes, an int[], I can't call it that. Which means I 
can't concatenate onto an existing int[], for example.

Why are filter and map returning these weird one-off types in 
the first place? Why would filter return anything but the same 
type of list that it was passed as an argument? Why would map 
return anything other than an array of whatever type matches 
the return value of the operator? And if there's some good 
reason why these things are the case, why then at the very 
least would you not be able to easily cast the return type into 
the return type that you want?
Feb 23 2013
next sibling parent reply "Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
Straight from the documentation: "returns a new range" -- no 
guarantee made about the relation of return type and parameter 
type.

These functions return range objects to amortize the potential 
cost of the operation, with a very small initial cost (a 
reference to the input range), which is actually very useful in a 
great many situations. They are all written as generics operating 
on ranges as both input and output.  For better or worse, this 
has become the de facto D idiom.  If you really do just want the 
array out of it, import std.array and call the eponymous function:

example = example.filter!isLongEnough().array();

Voila. This iterates the full range and collects the results, as 
expected. What *would* be nice would be to have "InPlace" 
variations of these functions for use cases such as yours, in 
order to re-use resources.

example.filterInPlace!isLongEnough();

Bam, done; assuming the input is a random access range (which a 
basic array is, is it not?).
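
For what it's worth, here is a minimal sketch of the "call array on 
the result" approach applied to both snippets from the original 
post. The helper names keepLong and lengthsOf are made up for this 
illustration; filter, map and array are the Phobos functions 
discussed above. Note that .length is a size_t rather than an int, 
so the sketch collects a size_t[].

```d
import std.algorithm : equal, filter, map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical helper: keep the inner arrays with more than minLen
// elements and hand back a real string[][].
string[][] keepLong(string[][] lists, size_t minLen)
{
    // filter only builds a lazy range; array() walks it and allocates.
    return lists.filter!(l => l.length > minLen).array;
}

// Hypothetical helper: collect the length of every inner array.
// .length is a size_t, so the result is size_t[], not int[].
size_t[] lengthsOf(string[][] lists)
{
    return lists.map!(l => l.length).array;
}

void main()
{
    string[][] example = [["a", "b", "sdflk"],
                          ["sdflkjsd", "sdl"],
                          ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];

    example = keepLong(example, 2);
    writeln(example); // [["a", "b", "sdflk"], ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]]

    assert(equal(lengthsOf(example), [3, 3]));
}
```

If an int[] is really what is wanted, one option would be to cast 
inside the mapping function, at the cost of truncating very large 
lengths.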
Feb 23 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Chris Nicholson-Sauls:

 What *would* be nice would be to have "InPlace" variations of 
 these functions for use cases such as yours, in order to re-use 
 resources.

 example.filterInPlace!isLongEnough();

Sometimes I'd like some in place operations in Phobos.

To remove duplicate items from a mutable array of sortable items:

data.length -= data.sort().uniq().copy(data).length;

or to just remove the item "5" from an array of ints, it's not 
hard to miss the necessary self-assignment:

data = data.delete(5);

Bye,
bearophile
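
As a rough sketch, the duplicate-removal one-liner above can be 
wrapped like this. The name dedupeInPlace is invented for the 
example and is not a Phobos function; sort, uniq and copy are the 
std.algorithm functions used in the one-liner.

```d
import std.algorithm : copy, sort, uniq;
import std.stdio : writeln;

// Hypothetical helper: sort data and drop duplicate values while
// reusing the array's own storage.
void dedupeInPlace(ref int[] data)
{
    // uniq lazily yields each distinct value of the sorted array once;
    // copy writes those values to the front of data and returns the
    // unfilled tail, whose length is the number of slots to cut off.
    data.length -= data.sort().uniq().copy(data).length;
}

void main()
{
    int[] data = [3, 1, 3, 2, 1];
    dedupeInPlace(data);
    writeln(data); // [1, 2, 3]
}
```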
Feb 23 2013
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/23/13 10:55 AM, Chris Nicholson-Sauls wrote:
 Straight from the documentation: "returns a new range" -- no guarantee
 made about the relation of return type and parameter type.

 These functions return range objects to amortize the potential cost of
 the operation, with a very small initial cost (a reference to the input
 range), which is actually very useful in a great many situations. They
 are all written as generics operating on ranges as both input and
 output. For better or worse, this has become the de facto D idiom. If
 you really do just want the array out of it, import std.array and call
 the eponymous function:

 example = example.filter!isLongEnough().array();

 Voila. This iterates the full range and collects the results, as
 expected. What *would* be nice would be to have "InPlace" variations of
 these functions for use cases such as yours, in order to re-use resources.

 example.filterInPlace!isLongEnough();

 Bam, done; assuming the input is a random access range (which a basic
 array is, is it not?).

example = example.remove!isTooShort;

Andrei
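
A small sketch of how that line might look in a complete program. 
isTooShort is not defined anywhere in the thread, so it is spelled 
out here as a lambda inside a made-up helper called dropShort; 
remove is std.algorithm's remove with a predicate.

```d
import std.algorithm : remove;
import std.stdio : writeln;

// Hypothetical helper: drop every inner array that has minLen
// elements or fewer, reusing the storage of the input array.
string[][] dropShort(string[][] lists, size_t minLen)
{
    // remove shifts the kept elements towards the front and returns
    // the shortened slice. The caller's slice keeps its old length,
    // which is why the result has to be assigned back.
    return lists.remove!(l => l.length <= minLen);
}

void main()
{
    string[][] example = [["a", "b", "sdflk"],
                          ["sdflkjsd", "sdl"],
                          ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];

    example = dropShort(example, 2);
    writeln(example); // [["a", "b", "sdflk"], ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]]
}
```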
Feb 23 2013
prev sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Saturday, 23 February 2013 at 08:48:25 UTC, Allen Nelson wrote:
 Why are filter and map returning these weird one-off types in 
 the first place? Why would filter return anything but the same 
 type of list that it was passed as an argument? Why would map 
 return anything other than an array of whatever type matches 
 the return value of the operator? And if there's some good 
 reason why these things are the case, why then at the very 
 least would you not be able to easily cast the return type into 
 the return type that you want?

If you are using an array, you can construct an array from the 
result using std.array.array

int[] evens = [1, 2, 3, 4, 5].map!(x => 2*x)().array();

The reason map, filter, and most other range functions don't 
return the same type of array is because they are lazy. For 
example:

int[] twoFourSix = [1, 2, 3, 4, 5].map!(x => 2*x)().take(3).array();

Here, the mapping function is only called on the first three 
elements of the array. If map returned the array [2, 4, 6, 8, 10] 
then it would be doing more work than necessary since the take 
only cares about the first 3 elements.

Also consider infinite ranges:

auto twoFourSixAdInfinitum = cycle([1, 2, 3]).map!(x => 2*x)();

What should map return here? You cannot have infinitely sized 
arrays!

There's no way to have map return the same range as the input, or 
even just an array, while also being lazy. If you want an array, 
use std.array.array().

Haskell manages to get around this by adding an extra layer of 
indirection around every operation, which hurts performance. D's 
standard library is designed to be high performance by default, 
which unfortunately does hurt expressiveness a little bit.
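
To make the laziness visible, here is a rough sketch that counts 
how often the mapping function runs. The helper names doubleFirst 
and doubledCycle and the counter are invented for the illustration; 
map, take, cycle and array are the Phobos functions from the 
examples above.

```d
import std.algorithm : map;
import std.array : array;
import std.range : cycle, take;
import std.stdio : writeln;

// Hypothetical helper: double only the first n elements of data.
// calls reports how often the mapping function actually ran.
int[] doubleFirst(int[] data, size_t n, out size_t calls)
{
    size_t counter;
    auto result = data
        .map!((int x) { ++counter; return 2 * x; })
        .take(n)
        .array; // nothing is computed until array() walks the range
    calls = counter;
    return result;
}

// Hypothetical helper: double the first n elements of an endless
// repetition of data (assumes data is not empty).
int[] doubledCycle(int[] data, size_t n)
{
    return cycle(data).map!(x => 2 * x).take(n).array;
}

void main()
{
    size_t calls;
    writeln(doubleFirst([1, 2, 3, 4, 5], 3, calls)); // [2, 4, 6]
    writeln("mapping function ran ", calls, " time(s) for 5 inputs");

    writeln(doubledCycle([1, 2, 3], 6)); // [2, 4, 6, 2, 4, 6]
}
```

An eager map could not even return from the second helper, since 
the cycled input never ends.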
Feb 23 2013
parent reply notna <notna.remove.this ist-einmalig.de> writes:
Thanks for these very good explanations and examples. This "Range" 
topic is a constant source of confusion; see one of many examples:

http://forum.dlang.org/thread/rynfksadckwinwaurtiy forum.dlang.org

I don't understand why D cannot be simple AND efficient.
I think every function that returns some kind of "Range" should also 
always be able to return "char", "char[]", "char[][]", "string" and 
"string[]" by default, depending on what is expected.


On 23.02.2013 11:14, Peter Alexander wrote:
 On Saturday, 23 February 2013 at 08:48:25 UTC, Allen Nelson wrote:
 Why are filter and map returning these weird one-off types in the
 first place? Why would filter return anything but the same type of
 list that it was passed as an argument? Why would map return anything
 other than an array of whatever type matches the return value of the
 operator? And if there's some good reason why these things are the
 case, why then at the very least would you not be able to easily cast
 the return type into the return type that you want?

If you are using an array, you can construct an array from the 
result using std.array.array

int[] evens = [1, 2, 3, 4, 5].map!(x => 2*x)().array();

The reason map, filter, and most other range functions don't 
return the same type of array is because they are lazy. For 
example:

int[] twoFourSix = [1, 2, 3, 4, 5].map!(x => 2*x)().take(3).array();

Here, the mapping function is only called on the first three 
elements of the array. If map returned the array [2, 4, 6, 8, 10] 
then it would be doing more work than necessary since the take 
only cares about the first 3 elements.

Also consider infinite ranges:

auto twoFourSixAdInfinitum = cycle([1, 2, 3]).map!(x => 2*x)();

What should map return here? You cannot have infinitely sized 
arrays!

There's no way to have map return the same range as the input, or 
even just an array, while also being lazy. If you want an array, 
use std.array.array().

Haskell manages to get around this by adding an extra layer of 
indirection around every operation, which hurts performance. D's 
standard library is designed to be high performance by default, 
which unfortunately does hurt expressiveness a little bit.
Feb 23 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Saturday, 23 February 2013 at 22:26:51 UTC, notna wrote:
 Thanks for these very good explanations and examples. This 
 "Range" topic is a constant source of confusion; see one of 
 many examples:

 http://forum.dlang.org/thread/rynfksadckwinwaurtiy forum.dlang.org

 I don't understand why D cannot be simple AND efficient.
 I think every function that returns some kind of "Range" 
 should also always be able to return "char", "char[]", 
 "char[][]", "string" and "string[]" by default, depending on 
 what is expected.

Being lazy is often critical for performance. Returning a normal 
type automatically implies eagerness.
Feb 23 2013