www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - popFrontExactly?

reply "monarch_dodra" <monarchdodra gmail.com> writes:
How does the community feel about having a "popFrontExactly" in 
phobos?

"popFrontN" is like "take": Safe, but more costly: It double 
checks there actually *are* "n" elements in the range, and if 
not, stops pop-ing.

When using popFrontN, more often than not, the "n" value is 
extracted from iterating the range already, so we *know* there 
are "n" elements in the range. In that case, popFrontN is 
un-necesarilly costly.

In that case, we could have "popFrontExactly", which would 
operate like "takeExactly": It would pre-assume that the *are* n 
elements, and assert otherwise.

--------
I know I've had the usecase for this in the past. Performance and 
convenience gains are there for both forward and sliceable ranges.

Should I push a pull request adding such a function (and its 
friends) to phobos? How does the group feel about such functions?
Nov 18 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, November 18, 2012 21:34:37 monarch_dodra wrote:
 Should I push a pull request adding such a function (and its
 friends) to phobos? How does the group feel about such functions?

In principle, it's a good idea, but I also worry about a proliferation of such functions. Personally, I think that I would have argued for popFrontN being popFrontExactly in the first place, but it's obviously too late for that. In probably all cases that I use popFrontN, what I'd want would be popFrontExactly (though popFrontNExactly would probably be a better name given its connection to popFrontN). So, you probably should create a pull request for it, but then you need popBackNExactly too, and possibly drop(Front)Exactly and dropBackExactly, and that's where things get a bit ugly. But given the concerns for efficiency and the prevalence of functions like popFrontN in range based code, popFrontNExactly would make good sense. - Jonathan M Davis
Nov 22 2012
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/23/2012 9:19 AM, Jonathan M Davis пишет:
 On Friday, November 23, 2012 03:55:21 Kapps wrote:
 Is it really that big an issue to have a few more methods than
 standard ranges would need? Before it certainly would be annoying
 to have to check if the range supported it, use that, and if not
 fake it, but now we have UFCS. It would be simple to have a basic
 fallback implementation of methods such as popFrontExactly, then
 simply use range.popFrontExactly to get the more performant one
 if the range supports it, and if not get the fallback.

 It would have to be clear in the documentation that these are
 optional methods however, and are recommended only if performance
 is required (and if the range supports it in a way that isn't
 simply an alternate implementation of the fallback).

You misunderstood. popFrontExactly/popFrontNExactly wouldn't be on any ranges any more than popFrontN or drop are on any ranges. They're free functions in std.range which either slice a range or call its popFront in a loop (depending on the type of range) in order to pop the appropriate number of elements off, and popFrontNExactly would be the same. It may very well be that we should add popFrontNExactly in order to get that extra efficiency gain over popFrontN in the cases where you know that the range contains at least the number of elements being popped.

This gets interesting - how did you know it? The number of elements to pop I mean. I'd say the cases where hasLength is true and there is no slicing is quite rare. It'd be interesting to know what are these cases that it this set of helpers tries to speed up. I mean a list of: -algorithms where popFrontN is used -ranges that allow hasLength but not slicing and work with the said algorithm And I agree with Jonathan that adding a bunch of helpers should be justified. Any helper function like say 'enforce' has utility only as long as simplifies a usage of a _frequent_ enough pattern. If it simplifies/speed up certain algorithms there is no guilt in just using them internally. -- Dmitry Olshansky
Nov 23 2012
prev sibling next sibling parent "Kapps" <opantm2+spam gmail.com> writes:
On Thursday, 22 November 2012 at 13:08:56 UTC, Jonathan M Davis 
wrote:
 On Sunday, November 18, 2012 21:34:37 monarch_dodra wrote:
 Should I push a pull request adding such a function (and its
 friends) to phobos? How does the group feel about such 
 functions?

In principle, it's a good idea, but I also worry about a proliferation of such functions. Personally, I think that I would have argued for popFrontN being popFrontExactly in the first place, but it's obviously too late for that. In probably all cases that I use popFrontN, what I'd want would be popFrontExactly (though popFrontNExactly would probably be a better name given its connection to popFrontN). So, you probably should create a pull request for it, but then you need popBackNExactly too, and possibly drop(Front)Exactly and dropBackExactly, and that's where things get a bit ugly. But given the concerns for efficiency and the prevalence of functions like popFrontN in range based code, popFrontNExactly would make good sense. - Jonathan M Davis

Is it really that big an issue to have a few more methods than standard ranges would need? Before it certainly would be annoying to have to check if the range supported it, use that, and if not fake it, but now we have UFCS. It would be simple to have a basic fallback implementation of methods such as popFrontExactly, then simply use range.popFrontExactly to get the more performant one if the range supports it, and if not get the fallback. It would have to be clear in the documentation that these are optional methods however, and are recommended only if performance is required (and if the range supports it in a way that isn't simply an alternate implementation of the fallback).
Nov 22 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, November 23, 2012 03:55:21 Kapps wrote:
 Is it really that big an issue to have a few more methods than
 standard ranges would need? Before it certainly would be annoying
 to have to check if the range supported it, use that, and if not
 fake it, but now we have UFCS. It would be simple to have a basic
 fallback implementation of methods such as popFrontExactly, then
 simply use range.popFrontExactly to get the more performant one
 if the range supports it, and if not get the fallback.
 
 It would have to be clear in the documentation that these are
 optional methods however, and are recommended only if performance
 is required (and if the range supports it in a way that isn't
 simply an alternate implementation of the fallback).

You misunderstood. popFrontExactly/popFrontNExactly wouldn't be on any ranges any more than popFrontN or drop are on any ranges. They're free functions in std.range which either slice a range or call its popFront in a loop (depending on the type of range) in order to pop the appropriate number of elements off, and popFrontNExactly would be the same. It may very well be that we should add popFrontNExactly in order to get that extra efficiency gain over popFrontN in the cases where you know that the range contains at least the number of elements being popped. It's just that it's getting a bit ridiculous how many of these sorts of methods we have. If you add popFrontNExactly, you need popBackNExactly and probably drop(Front)Exactly and dropBackExactly as well. Any time that another function like this gets suggested, you end up having to add a number of similar functions, and it adds up. Its arguably a bit ridiculous to have so many functions that do almost the same thing but not quite. I think that we'll probably need to add them in this case due to efficiency concerns, but it's not without cost. We don't want too many stray functions doing almost the same thing in Phobos. - Jonathan M Davis
Nov 22 2012
prev sibling next sibling parent Tobias Pankrath <lists pankrath.net> writes:
 I think that we'll probably need to add them in this case due to efficiency
 concerns, but it's not without cost. We don't want too many stray functions
 doing almost the same thing in Phobos.

 - Jonathan M Davis

Better to have them in phobos instead of every ones own little helper library. And I would call in popFrontExactly, not popFrontNExactly. The latter is just hard to read and unnecessarily so.
Nov 23 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 23 November 2012 at 08:42:18 UTC, Jonathan M Davis 
wrote:
 I think that we'll probably need to add them in this case due 
 to efficiency
 concerns, but it's not without cost. We don't want too many 
 stray functions
 doing almost the same thing in Phobos.

 - Jonathan M Davis

I'd say it's all a matter of presentation. I'm sure they can all easily be regrouped into what feels like "a neat family of functions". What counts is the perceived complexity, more than the actual amount of function names. They aren't quite overloads, but it doesn't make them much more complex than, say, the "to" functions.
Nov 23 2012
prev sibling next sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 23 November 2012 at 15:48:04 UTC, Dmitry Olshansky 
wrote:
 11/23/2012 9:19 AM, Jonathan M Davis пишет:
 On Friday, November 23, 2012 03:55:21 Kapps wrote:
 Is it really that big an issue to have a few more methods than
 standard ranges would need? Before it certainly would be 
 annoying
 to have to check if the range supported it, use that, and if 
 not
 fake it, but now we have UFCS. It would be simple to have a 
 basic
 fallback implementation of methods such as popFrontExactly, 
 then
 simply use range.popFrontExactly to get the more performant 
 one
 if the range supports it, and if not get the fallback.

 It would have to be clear in the documentation that these are
 optional methods however, and are recommended only if 
 performance
 is required (and if the range supports it in a way that isn't
 simply an alternate implementation of the fallback).

You misunderstood. popFrontExactly/popFrontNExactly wouldn't be on any ranges any more than popFrontN or drop are on any ranges. They're free functions in std.range which either slice a range or call its popFront in a loop (depending on the type of range) in order to pop the appropriate number of elements off, and popFrontNExactly would be the same. It may very well be that we should add popFrontNExactly in order to get that extra efficiency gain over popFrontN in the cases where you know that the range contains at least the number of elements being popped.

This gets interesting - how did you know it? The number of elements to pop I mean.

Usually, either by a previous iteration, or just because it is statically know the range will have that many elements. Or that it *should* have that many elements. You could ask the question the other way around: Why would you pop a certain amount of elements, if you don't even know your range actually *holds* that many elements? Why do you even need the safeguard in the first place?
 I'd say the cases where hasLength is true and there is no 
 slicing is quite rare. It'd be interesting to know what are 
 these cases that it this set of helpers tries to speed up. I 
 mean a list of:
 -algorithms where popFrontN is used
 -ranges that allow hasLength but not slicing and work with the 
 said algorithm

What about the case of plain bidirectional ranges? That's the one that's being sped up. Still regardless of performance, the (my) motivating factor is that when you use "drop", it pretty much silently fails if your range doesn't have the amount of elements. IMO, in the long run, this makes "drop" *un*-safe...
Nov 23 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Friday, November 23, 2012 17:01:04 monarch_dodra wrote:
 You could ask the question the other way around: Why would you
 pop a certain amount of elements, if you don't even know your
 range actually *holds* that many elements? Why do you even need
 the safeguard in the first place?

It would be easy to have a file or message format where the value of one sequence of bytes tells you how many are left or how many are in the next section. If you were going to pop those bytes without looking at them, then such a check would be necessary if you don't actually know how many bytes are left in the range (e.g. you're dealing with a simple, input range). That being said, I've never written code which used popFrontN when I didn't know how many elements were going to be popped off. If I use popFrontN, either I know the range's length, or I know that it has at least as many elements as I'm trying to pop off (because I used take on it or otherwise iterated over a part of the range). If I don't know how many elements there are to pop off, then I'm going to be popping them off one by one and checking empty - though if I relaly don't need to actually look at each element, it would arguably be better to just use popFrontN. In most cases though, if I don't know the number of elements left, then I'm probably looking at them. In any case, there _are_ cases where you could legitimately call popFrontN without knowing if there are at least n elements to pop off, but I never use it that way.
 Still regardless of performance, the (my) motivating factor is
 that when you use "drop", it pretty much silently fails if your
 range doesn't have the amount of elements. IMO, in the long run,
 this makes "drop" *un*-safe...

I don't know if we can get away with it or not (given the possibility of breaking code), but given that drop doesn't return how many elements were popped (unlike popFrontN), I'd favor simply making it drop the exact number of elements. Then if you wanted popFrontExactly, you could just do range = drop(range, n); It's slightly less efficient (assuming that the optimizer doesn't help you out), but it avoids having to create a new function. It was already one of Andrei's complaints about drop when it was introduced that there was no way to determine how many elements were actually dropped, so he might favor the idea of making drop assert that it drops the correct number of elements rather than using popFrontN. The one concern would be whether making that change would break any code. Given that you can't know how many elements were actually dropped and the fact that I suspect most of us don't call popFrontN unless we know that there were at least that many elements to drop, the amount of code broken might be negligle if not outright 0. But we don't want to break code, so that may just not be an option. Still, if we can't do that, and we're trying to minimize the number of stray functions in std.range, then we could just create drop(Front)Exactly and dropBackExactly and let people do range = dropExactly(range, 5); if they want popFrontExactly. - Jonathan M Davis
Nov 23 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
I wouldn't mind also adding a popFrontApproximately()

=P
Nov 23 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, November 24, 2012 07:43:26 Mehrdad wrote:
 I wouldn't mind also adding a popFrontApproximately()
 
 =P

So, what's it do? Randomly pick a number close to the number that you give it and try to pop that many? :) - Jonathan M Davis
Nov 23 2012
prev sibling next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Saturday, 24 November 2012 at 06:50:43 UTC, Jonathan M Davis 
wrote:
 On Saturday, November 24, 2012 07:43:26 Mehrdad wrote:
 I wouldn't mind also adding a popFrontApproximately()
 
 =P

So, what's it do? Randomly pick a number close to the number that you give it and try to pop that many? :) - Jonathan M Davis

That's an implementation detail.
Nov 23 2012
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Friday, 23 November 2012 at 21:18:57 UTC, Jonathan M Davis 
wrote:
 Still, if we can't do that, and we're trying to minimize the 
 number of stray
 functions in std.range, then we could just create 
 drop(Front)Exactly and
 dropBackExactly and let people do

 range = dropExactly(range, 5);

 if they want popFrontExactly.

 - Jonathan M Davis

Well, if the goal was reducing the amount of functions, or avoiding hard to grasp variants, then I'd agree. But in this case, the goal (IMO) is to provide convenient and intuitive tools for coding. I think (but this is my point of view), that while there are several different names and variants, it all remains in a simple and comprehensive package. The function names alone are enough to know what it does, and it's not like there are any "traps", or things where you'd need to read documentation 10 minutes to understand *why* there are different functions. I don't see it as a problem to have all these functions, but that's my point of view.
Nov 26 2012