digitalmars.D.learn - In general, who should do more work: popFront or front?
- surlymoor (13/13) Jun 14 2021 All my custom range types perform all their meaningful work in
- jfondren (25/39) Jun 14 2021 Well, consider this program:
- surlymoor (13/38) Jun 14 2021 Appreciate the response.
- mw (3/17) Jun 14 2021 In English, `front` is a noun, `popFront` has a verb in it, so
- surlymoor (6/8) Jun 14 2021 Sure, but, for example, the front method of Phobos' MapResult is
- mw (25/33) Jun 14 2021 I think there is another convention (although it's not formally
- surlymoor (9/17) Jun 14 2021 That's a separate matter. To use as an example once more,
- ag0aep6g (6/26) Jun 14 2021 `const` and `inout` mean the same thing here: Both functions cannot
- Ali Çehreli (6/11) Jun 15 2021 In other words, front() should be "idempotent".
- Mike Parker (9/15) Jun 14 2021 The "general rule" is that `front` should return the same result
- Paul Backus (16/30) Jun 15 2021 It's a time-space tradeoff. As you say, caching requires
- H. S. Teoh (9/25) Jun 15 2021 One way to address this is to make the computation lazy: the element is
- Steven Schveighoffer (12/25) Jun 15 2021 IMO, `front` should not be figuring out which element is next. That is
All my custom range types perform all their meaningful work in their respective popFront methods, in addition to their expected source data iteration duties. The reason I do this is that I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) At the moment, I feel that as long as the stashed front element isn't too "big" (for some definition of big, I guess), built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)
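A minimal sketch of the stash-in-popFront approach described above; the range name, the squaring "work", and the compute helper are illustrative assumptions, not code from the thread:

```d
import std.stdio : writeln;

// Caches the current element so front stays O(1); all of the real work
// happens in the constructor and in popFront.
struct SquaresOf
{
    private int[] source;
    private int _front;                  // the stashed current element

    this(int[] source)
    {
        this.source = source;
        if (source.length) compute();    // tee up the first element
    }

    bool empty() const { return source.length == 0; }

    // O(1): just hands back the stashed value, never recomputes.
    int front() const { return _front; }

    void popFront()
    {
        source = source[1 .. $];
        if (source.length) compute();    // do the work while advancing
    }

    private void compute() { _front = source[0] * source[0]; }
}

void main()
{
    foreach (x; SquaresOf([1, 2, 3]))
        writeln(x);                      // 1, 4, 9
}
```

The cost is one stored element per range, but repeated reads of front never redo the computation.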
Jun 14 2021
On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)

Well, consider this program:

```d
import std;

struct Noisy {
    int[] source;
    int pops, fronts;

    bool empty() { return source.empty; }

    void popFront() {
        writeln("popFront #", ++pops);  // print each call so the counts are visible
        source.popFront;
    }

    int front() {
        writeln("front #", ++fronts);
        return source.front;
    }
}

void main() {
    iota(5).array
        .Noisy
        .filter!"a%2"
        .each!writeln;
}
```

Out of [0,1,2,3,4], only 1,3 pass the filter. Noisy's front is called seven times, 2x for each filter success. Noisy's popFront is called five times, 1x for each source member. But if you slap a .cache (from std.algorithm.cache) before the .filter then these counts are the same.
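For illustration, the same pipeline with the suggested .cache inserted might look like this (a sketch assuming `import std;` and the Noisy struct above; explicit construction is used here in place of the .Noisy call):

```d
void main() {
    Noisy(iota(5).array)
        .cache          // evaluates Noisy.front once per element and memoizes it
        .filter!"a%2"
        .each!writeln;  // still prints 1 and 3
}
```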
Jun 14 2021
On Tuesday, 15 June 2021 at 04:43:38 UTC, jfondren wrote:Well, consider this program: [...] Out of [0,1,2,3,4], only 1,3 pass the filter. Noisy's front is called seven times, 2x for each filter success. Noisy's popFront is called five times, 1x for each source member. But if you slap a .cache (from std.algorithm.cache) before the .filter then these counts are the same.

Appreciate the response.

From your example, one solution might be to document one's custom range type with a disclaimer that its front method is expensive to invoke; the consumer of the range, given this information, can then decide whether using cache is appropriate for them. I'm still left wondering whether one should strive to ensure that front is O(1), and what the various solutions to that are. With that said, I now realize my question is kind of ridiculous since the answer might be predicated upon other aspects: the type of data being consumed; the common operations that are to be performed upon a custom range.
Jun 14 2021
On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)In English, `front` is a noun, `popFront` has a verb in it, so you know who should do more work :-)
Jun 14 2021
On Tuesday, 15 June 2021 at 04:57:45 UTC, mw wrote:In English, `front` is a noun, `popFront` have a verb in it, so you know who should do more work :-)Sure, but, for example, the front method of Phobos' MapResult is the one applying the predicate to the source's front. There's a clear separation of responsibilities between popFront and front: the former iterates over the source, and the latter performs an operation that produces the range's elements.
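A stripped-down sketch of that separation of responsibilities, in the spirit of MapResult (the names and the x * 10 function are illustrative, not Phobos code):

```d
import std.range.primitives : empty, front, popFront;
import std.stdio : writeln;

struct MapLike(alias fun, R)
{
    R source;

    bool empty() { return source.empty; }
    auto front() { return fun(source.front); }  // the work: runs on every call
    void popFront() { source.popFront(); }      // only advances the source
}

void main()
{
    auto r = MapLike!(x => x * 10, int[])([1, 2, 3]);
    foreach (x; r)
        writeln(x);                             // 10, 20, 30
}
```

Reading front twice for the same element runs fun twice, which is exactly the trade-off the rest of the thread discusses.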
Jun 14 2021
On Tuesday, 15 June 2021 at 05:03:45 UTC, surlymoor wrote:On Tuesday, 15 June 2021 at 04:57:45 UTC, mw wrote:In English, `front` is a noun, `popFront` have a verb in it, so you know who should do more work :-)Sure, but, for example, the front method of Phobos' MapResult is the one applying the predicate to the source's front. There's a clear separation of responsibilities between popFront and front: the former iterates over the source, and the latter performs an operation that produces the range's elements.

I think there is another convention (although it's not formally enforced, it should be):

-- `obj.front() [should be] const`, i.e. it shouldn't modify `obj`, so it can be called multiple times at any given state and produce the same result

-- `obj.popFront()` performs the actual mutation of the internal state of `obj`; each invocation changes the object's state once.

e.g. https://dlang.org/library/std/range/primitives/front.html the 2nd decl:

dchar front(T) ( scope const(T)[] a ) pure @property @safe
if (isAutodecodableString!(T[]));

you can see `const`, but https://dlang.org/library/std/range/primitives/pop_front.html

void popFront(C) ( scope ref inout(C)[] str ) pure nothrow @trusted
if (isAutodecodableString!(C[]));

it's `ref inout`, which means the passed-in object is intended to be modified inside the method in-place.
Jun 14 2021
On Tuesday, 15 June 2021 at 05:17:06 UTC, mw wrote:I think there is another convention (although it's not formally enforced, it should be): -- `obj.front() [should be] const`, i.e. it shouldn't modify `obj`, so it can be called multiple times at any given state and produce the same result -- `obj.popFront()` performs the actual mutation of the internal state of `obj`; each invocation changes the object's state once.

That's a separate matter. To use MapResult as an example once more, its front method produces the range's elements, and invoking it yields the identical element as long as popFront isn't called; it doesn't affect the range's source. It satisfies those expectations, but it leaves me wanting to know whether one should strive for front being inexpensive to call. My original question is stupid, and I think I'm just going to have to experiment.
Jun 14 2021
On 15.06.21 07:17, mw wrote:https://dlang.org/library/std/range/primitives/front.html the 2nd decl:

dchar front(T) ( scope const(T)[] a ) pure @property @safe
if (isAutodecodableString!(T[]));

you can see `const`, but https://dlang.org/library/std/range/primitives/pop_front.html

void popFront(C) ( scope ref inout(C)[] str ) pure nothrow @trusted
if (isAutodecodableString!(C[]));

it's `ref inout`, which means the passed-in object is intended to be modified inside the method in-place.

`const` and `inout` mean the same thing here: Neither function can modify the elements of the array. The difference lies in `ref` alone. `front` receives a copy, so it can't change the length of the array you pass in. `popFront` receives by reference, so it can (and does).
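A small added illustration of that difference using an autodecodable string (not from the original post):

```d
import std.range.primitives : front, popFront;
import std.stdio : writeln;

void main()
{
    string s = "héllo";
    writeln(s.front);     // 'h' as a dchar; s is passed by value and unchanged
    s.popFront();         // passed by ref: drops the first code point in place
    writeln(s);           // "éllo"
    writeln(s.length);    // 5: 'é' is two UTF-8 code units, plus "llo"
}
```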
Jun 14 2021
On 6/14/21 10:17 PM, mw wrote:I think there is another convention (although it's not formally enforced, it should be): -- `obj.front() [should be] const`, i.e. it shouldn't modify `obj`, so it can be called multiple times at any given state and produce the same resultIn other words, front() should be "idempotent". To the OP, there is the following presentation that is related and touches on similar concerns: https://forum.dlang.org/thread/diexjstekiyzgxlicnts@forum.dlang.org Ali
Jun 15 2021
On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)The "general rule" is that `front` should return the same result on multiple calls until after the next call to `popFront`. I don't know of any requirement that such an operation be O(1), but the expectation is certainly there. The implication then is that any necessary work should be carried out in `popFront` to advance the range (and additionally in a constructor if things need to be teed up first) and that `front` should just return the current element.
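A tiny check of that rule using map (the doubling function is just an illustration):

```d
import std.algorithm : map;
import std.stdio : writeln;

void main()
{
    auto r = [1, 2, 3].map!(x => x * 2);
    assert(r.front == r.front);  // repeated calls return the same result...
    writeln(r.front);            // 2
    r.popFront();
    writeln(r.front);            // ...until popFront advances the range: 4
}
```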
Jun 14 2021
On Tuesday, 15 June 2021 at 04:24:09 UTC, surlymoor wrote:All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)

It's a time-space tradeoff. As you say, caching requires additional space to store the cached element. On the other hand, *not* caching means that you spend unnecessary time computing the next element in cases where the range is only partially consumed. For example:

```d
import std.range: generate, take;
import std.algorithm: each;
import std.stdio: writeln;

generate!someExpensiveFunction.take(3).each!writeln;
```

Naively, you'd expect that `someExpensiveFunction` would be called 3 times--but it is actually called 4 times, because `generate` does its work in its constructor and `popFront` instead of in `front`.
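A small check of that count; the module-level counter and the cheap stand-in body for someExpensiveFunction are added here purely for illustration:

```d
import std.range : generate, take;
import std.algorithm : each;
import std.stdio : writeln;

int calls;
int someExpensiveFunction() { return ++calls; }  // cheap stand-in that just counts

void main()
{
    generate!someExpensiveFunction.take(3).each!writeln;  // prints 1, 2, 3
    writeln("calls: ", calls);                            // 4, not 3
}
```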
Jun 15 2021
On Tue, Jun 15, 2021 at 02:20:11PM +0000, Paul Backus via Digitalmars-d-learn wrote: [...]It's a time-space tradeoff. As you say, caching requires additional space to store the cached element. On the other hand, *not* caching means that you spend unnecessary time computing the next element in cases where the range is only partially consumed. For example:

```d
import std.range: generate, take;
import std.algorithm: each;
import std.stdio: writeln;

generate!someExpensiveFunction.take(3).each!writeln;
```

Naively, you'd expect that `someExpensiveFunction` would be called 3 times--but it is actually called 4 times, because `generate` does its work in its constructor and `popFront` instead of in `front`.

One way to address this is to make the computation lazy: the element is only computed once on demand, and once computed it's cached. But of course, this is probably overkill on many ranges. So the long answer is, it depends. :-) T -- It is not the employer who pays the wages. Employers only handle the money. It is the customer who pays the wages. -- Henry Ford
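Here is a rough sketch of that lazy, cached-on-demand front; the wrapper name LazyCached, its design, and the squaring stand-in for the expensive computation are all assumptions for illustration:

```d
import std.range.primitives : ElementType, empty, front, popFront;
import std.typecons : Nullable;

struct LazyCached(alias fun, R)
{
    R source;
    private Nullable!(typeof(fun(ElementType!R.init))) cached;

    bool empty() { return source.empty; }

    auto front()
    {
        if (cached.isNull)
            cached = fun(source.front);  // computed at most once per element
        return cached.get;
    }

    void popFront()
    {
        source.popFront();
        cached.nullify();                // the next front() recomputes lazily
    }
}

void main()
{
    import std.algorithm : filter, each;
    import std.stdio : writeln;

    // x * x stands in for an expensive computation; it runs once per element
    // even though filter reads front more than once for passing elements.
    LazyCached!(x => x * x, int[])([1, 2, 3, 4])
        .filter!(x => x % 2 == 1)
        .each!writeln;                   // 1, 9
}
```

Whether the extra Nullable and the branch in front are worth it depends on how expensive fun really is, which is the "it depends" above.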
Jun 15 2021
On 6/15/21 12:24 AM, surlymoor wrote:All my custom range types perform all their meaningful work in their respective popFront methods, in addition to its expected source data iteration duties. The reason I do this is because I swear I read in a github discussion that front is expected to be O(1), and the only way I can think to achieve this is to stash the front element of a range in a private field; popFront would thus also set this field to a new value upon every call, and front would forward to it. (Or front would be the cache itself.) At the moment, I feel that as long as the stashed front element isn't too "big" (For some definition of big, I guess.), that built-in caching should be fine. But is this acceptable? What's the best practice for determining which range member should perform what work? (Other than iterating, of course.)IMO, `front` should not be figuring out which element is next. That is the job of `popFront`. But that doesn't mean `front` cannot do work. map is a prime example. If you use `map` though, you know what you are signing up for. `front` is expected to be called multiple times to get the same data. One thing to think about is that there is a `cache` wrapper range which can store it for you. This way, you give your users the option of caching or not. Each situation is different, and depends on the underlying mechanisms needed to make the range operate. -Steve
Jun 15 2021