digitalmars.D - Filter and Map's weird return types cause frustration...
- Allen Nelson (59/60) Feb 23 2013 I'm trying to do some functional programming techniques on lists
- Chris Nicholson-Sauls (18/18) Feb 23 2013 Straight from the documentation: "returns a new range" -- no
- bearophile (9/13) Feb 23 2013 Sometimes I'd like some in place operations in Phobos.
- Andrei Alexandrescu (3/19) Feb 23 2013 example = example.remove!isTooShort;
- Peter Alexander (24/32) Feb 23 2013 If you are using an array, you can construct an array from the
- notna (8/36) Feb 23 2013 Thanks for this very good explanations and examples. This "Range" topic
- John Colvin (3/11) Feb 23 2013 Being lazy is often critical for performance. Returning a normal
I'm trying to do some functional programming techniques on lists in D, and while I'm happy that there's some functionality to be found, some of (what I perceive as) the language's foibles don't make it as smooth as I'd like. Here's an example:

    import std.stdio, std.algorithm;

    void main() {
        string[][] example = [["a", "b", "sdflk"], ["sdflkjsd", "sdl"], ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];
        writeln(example);
        int i=2;
        bool isLongEnough (string[] slist) {return slist.length > i;}
        example = filter!(isLongEnough)(example);
        writeln(example);
    }

It seems a perfectly valid desire, to have an array and want to prune it down into an array that satisfies certain requirements. After all, this is exactly what filter is supposed to do. This code seems fairly straightforward then, but trying to compile it:

    dmd test.d
    test.d(10): Error: cannot implicitly convert expression (filter(reps)) of type FilterResult!(isLongEnough, string[][]) to string[][]

So essentially, the compiler is telling me that even though filter's whole raison d'etre is to take a data structure and return the same kind of data structure, but with some values possibly removed, it not only returns a weird in-house-brand "FilterResult" type, it can't even convert this into a string[][] type. Now, I could use "auto" instead of "string[][]", and it works, but I have no use for a "FilterResult". I don't know what methods I can call on it, or what it's made out of. I want a string[][].

A similar story happens with the map function. Map takes an array and a function and returns another array whose values are computed by applying the function to the input array. One would think, then, that this code would work:

    void main() {
        string[][] reps = [["a", "b", "sdflk"], ["sdflkjsd", "sdl"], ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];
        writeln(reps);
        int[] lengths = map!("a.length")(reps);
        writeln(lengths);
    }

After all, the .length function returns an int, and we have an array of them, so map returns an int[], right? Wrong!

    dmd test.d
    test.d(8): Error: cannot implicitly convert expression (map(reps)) of type MapResult!(unaryFun, string[][]) to int[]

Once again, even though the returned array is, for all intents and purposes, an int[], I can't call it that. Which means I can't concatenate onto an existing int[], for example.

Why are filter and map returning these weird one-off types in the first place? Why would filter return anything but the same type of list that it was passed as an argument? Why would map return anything other than an array of whatever type matches the return value of the operator? And if there's some good reason why these things are the case, why then at the very least would you not be able to easily cast the return type into the return type that you want?
Feb 23 2013
Straight from the documentation: "returns a new range" -- no guarantee made about the relation of return type and parameter type. These functions return range objects to amortize the potential cost of the operation, with a very small initial cost (a reference to the input range), which is actually very useful in a great many situations. They are all written as generics operating on ranges as both input and output. For better or worse, this has become the de facto D idiom.

If you really do just want the array out of it, import std.array and call the eponymous function:

    example = example.filter!isLongEnough().array();

Voila. This iterates the full range and collects the results, as expected.

What *would* be nice would be to have "InPlace" variations of these functions for use cases such as yours, in order to re-use resources.

    example.filterInPlace!isLongEnough();

Bam, done; assuming the input is a random access range (which a basic array is, is it not?).
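For anyone following along, here is a minimal sketch of the original program with `.array` added. The helper names `keepLongEnough` and `innerLengths` are made up for illustration and are not part of Phobos. One extra wrinkle: `.length` yields a `size_t`, not an `int`, so the sketch collects into `size_t[]`.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.stdio : writeln;

// Hypothetical helper: eagerly collect the lazy filter result into a new array.
string[][] keepLongEnough(string[][] lists, size_t minLength)
{
    return lists.filter!(l => l.length > minLength).array;
}

// Hypothetical helper: .length is a size_t, so the result is size_t[], not int[].
size_t[] innerLengths(string[][] lists)
{
    return lists.map!(l => l.length).array;
}

void main()
{
    string[][] example = [["a", "b", "sdflk"], ["sdflkjsd", "sdl"],
                          ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];

    example = keepLongEnough(example, 2);
    writeln(example); // the two inner lists with more than 2 items

    size_t[] lengths = innerLengths(example);
    lengths ~= 42;    // an ordinary array again, so concatenation works
    writeln(lengths); // [3, 3, 42]

    // The lazy result can also be used directly, without naming its type.
    foreach (row; example.filter!(r => r.length > 2))
        writeln(row);
}
```

The `foreach` at the end is the usual way to consume a range when no array is needed: the only things you have to know about the "weird" type are that it can be iterated and passed on to other range functions.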
Feb 23 2013
Chris Nicholson-Sauls:
> What *would* be nice would be to have "InPlace" variations of these functions for use cases such as yours, in order to re-use resources.
>
>     example.filterInPlace!isLongEnough();

Sometimes I'd like some in place operations in Phobos. To remove duplicate items from a mutable array of sortable items:

    data.length -= data.sort().uniq().copy(data).length;

or to just remove the item "5" from an array of ints, it's easy to miss the necessary self-assignment:

    data = data.delete(5);

Bye,
bearophile
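The de-duplication one-liner is dense, so here is a small sketch of it wrapped in a function. The name `sortedUniqueInPlace` is hypothetical; the calls themselves (`sort`, `uniq`, `copy`) are the std.algorithm functions used in the post. It relies on `copy` returning the part of the target that was not written to.

```d
import std.algorithm : copy, sort, uniq;
import std.stdio : writeln;

// Hypothetical helper wrapping the one-liner: sorts `data`, drops duplicates,
// and shrinks the array in place.
void sortedUniqueInPlace(ref int[] data)
{
    // copy writes the unique items to the front of `data` and returns the
    // unfilled tail, whose length is the number of slots to cut off.
    data.length -= data.sort().uniq().copy(data).length;
}

void main()
{
    int[] data = [3, 1, 2, 3, 1];
    sortedUniqueInPlace(data);
    writeln(data); // [1, 2, 3]
}
```

Writing into the array being read is safe here because each unique item is stored at or before the position it was read from.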
Feb 23 2013
On 2/23/13 10:55 AM, Chris Nicholson-Sauls wrote:
> Straight from the documentation: "returns a new range" -- no guarantee made about the relation of return type and parameter type. These functions return range objects to amortize the potential cost of the operation, with a very small initial cost (a reference to the input range), which is actually very useful in a great many situations. They are all written as generics operating on ranges as both input and output. For better or worse, this has become the de facto D idiom.
>
> If you really do just want the array out of it, import std.array and call the eponymous function:
>
>     example = example.filter!isLongEnough().array();
>
> Voila. This iterates the full range and collects the results, as expected.
>
> What *would* be nice would be to have "InPlace" variations of these functions for use cases such as yours, in order to re-use resources.
>
>     example.filterInPlace!isLongEnough();
>
> Bam, done; assuming the input is a random access range (which a basic array is, is it not?).

    example = example.remove!isTooShort;

Andrei
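A minimal sketch of that suggestion applied to the original data, assuming "too short" means an inner list with `minLength` or fewer items. The helper name `dropShort` is made up; `remove` with a predicate is the std.algorithm function.

```d
import std.algorithm : remove;
import std.stdio : writeln;

// Hypothetical helper: drop every inner list with `minLength` or fewer items.
string[][] dropShort(string[][] lists, size_t minLength)
{
    // remove shifts the surviving elements forward inside the same array and
    // returns the shortened slice, so the caller must keep the return value.
    return lists.remove!(l => l.length <= minLength);
}

void main()
{
    string[][] example = [["a", "b", "sdflk"], ["sdflkjsd", "sdl"],
                          ["sdfjhsdklfjh", "sdkjfh", "alskaslk"]];

    example = dropShort(example, 2);
    writeln(example); // the two-item list is gone, order is preserved
}
```

Unlike `filter` followed by `array`, this reuses the existing storage instead of allocating a new array.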
Feb 23 2013
On Saturday, 23 February 2013 at 08:48:25 UTC, Allen Nelson wrote:
> Why are filter and map returning these weird one-off types in the first place? Why would filter return anything but the same type of list that it was passed as an argument? Why would map return anything other than an array of whatever type matches the return value of the operator? And if there's some good reason why these things are the case, why then at the very least would you not be able to easily cast the return type into the return type that you want?

If you are using an array, you can construct an array from the result using std.array.array:

    int[] evens = [1, 2, 3, 4, 5].map!(x => 2*x)().array();

The reason map, filter, and most other range functions don't return the same type of array is because they are lazy. For example:

    int[] twoFourSix = [1, 2, 3, 4, 5].map!(x => 2*x)().take(3).array();

Here, the mapping function is only called on the first three elements of the array. If map returned the array [2, 4, 6, 8, 10] then it would be doing more work than necessary since the take only cares about the first 3 elements.

Also consider infinite ranges:

    auto twoFourSixAdInfinitum = cycle([1, 2, 3]).map!(x => 2*x)();

What should map return here? You cannot have infinitely sized arrays!

There's no way to have map return the same range as the input, or even just an array, while also being lazy. If you want an array, use std.array.array().

Haskell manages to get around this by adding an extra layer of indirection around every operation, which hurts performance. D's standard library is designed to be high performance by default, which unfortunately does hurt expressiveness a little bit.
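To see the laziness for yourself, here is a small sketch that counts how often the mapping function runs and takes a finite prefix of an infinite range. The helper `firstDoubled` and the `calls` counter are illustrative names only.

```d
import std.algorithm : map;
import std.array : array;
import std.range : cycle, take;
import std.stdio : writeln;

// Hypothetical helper: the first `n` doubled values of an endlessly
// repeating input. `source` must not be empty.
int[] firstDoubled(int[] source, size_t n)
{
    return source.cycle.map!(x => 2 * x).take(n).array;
}

void main()
{
    int calls = 0;
    auto doubled = [1, 2, 3, 4, 5].map!((x) { ++calls; return 2 * x; });
    writeln(calls); // 0: building the range runs nothing yet

    int[] twoFourSix = doubled.take(3).array;
    writeln(twoFourSix); // [2, 4, 6]
    writeln(calls);      // how often the lambda actually ran

    writeln(firstDoubled([1, 2, 3], 7)); // [2, 4, 6, 2, 4, 6, 2]
}
```

The work only happens once `array` pulls elements through the chain, which is why the infinite `cycle` is harmless as long as something like `take` bounds it first.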
Feb 23 2013
Thanks for these very good explanations and examples. This "Range" topic is a constant source of confusion; see one of many examples: http://forum.dlang.org/thread/rynfksadckwinwaurtiy forum.dlang.org

I don't understand why D cannot be simple AND efficient. I think every function which returns some kind of "Range" should also always be able to return "char", "char[]", "char[][]", "string" and "string[]" by default, depending on what is expected.

On 23.02.2013 11:14, Peter Alexander wrote:
> On Saturday, 23 February 2013 at 08:48:25 UTC, Allen Nelson wrote:
>> Why are filter and map returning these weird one-off types in the first place? Why would filter return anything but the same type of list that it was passed as an argument? Why would map return anything other than an array of whatever type matches the return value of the operator? And if there's some good reason why these things are the case, why then at the very least would you not be able to easily cast the return type into the return type that you want?
>
> If you are using an array, you can construct an array from the result using std.array.array:
>
>     int[] evens = [1, 2, 3, 4, 5].map!(x => 2*x)().array();
>
> The reason map, filter, and most other range functions don't return the same type of array is because they are lazy. For example:
>
>     int[] twoFourSix = [1, 2, 3, 4, 5].map!(x => 2*x)().take(3).array();
>
> Here, the mapping function is only called on the first three elements of the array. If map returned the array [2, 4, 6, 8, 10] then it would be doing more work than necessary since the take only cares about the first 3 elements.
>
> Also consider infinite ranges:
>
>     auto twoFourSixAdInfinitum = cycle([1, 2, 3]).map!(x => 2*x)();
>
> What should map return here? You cannot have infinitely sized arrays!
>
> There's no way to have map return the same range as the input, or even just an array, while also being lazy. If you want an array, use std.array.array().
>
> Haskell manages to get around this by adding an extra layer of indirection around every operation, which hurts performance. D's standard library is designed to be high performance by default, which unfortunately does hurt expressiveness a little bit.
Feb 23 2013
On Saturday, 23 February 2013 at 22:26:51 UTC, notna wrote:
> Thanks for these very good explanations and examples. This "Range" topic is a constant source of confusion; see one of many examples: http://forum.dlang.org/thread/rynfksadckwinwaurtiy forum.dlang.org
>
> I don't understand why D cannot be simple AND efficient. I think every function which returns some kind of "Range" should also always be able to return "char", "char[]", "char[][]", "string" and "string[]" by default, depending on what is expected.

Being lazy is often critical for performance. Returning a normal type automatically implies eagerness.
Feb 23 2013