www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - So why doesn't popFront return an element?

reply Andrej Mitrovic <none none.none> writes:
I'm trying to understand the design of ranges. Why does popFront only set the
front() property to return the next element in the range? Why not return the
element in the call to popFront right away?

For example code like this (which doesn't work since popFront doesn't return):
void main()
{
    int[] a = [1, 2];
    auto b = a.popFront;
    assert(a == [2]);
    assert(b == 1);
}

Isn't it wasteful to have to call both popFront() and front() to simultaneously
remove an element from a range and return it? I mean it's an extra function
call, right?
Apr 13 2011
next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Andrej Mitrovic Wrote:

 Isn't it wasteful to have to call both popFront() and front() to
simultaneously remove an element from a range and return it? I mean it's an
extra function call, right?
Isn't also a waste to return something that isn't going to be used? There are times it is important to just advance and either skip it or use the value in another location. In either situation I think the overhead is minimal.
Apr 13 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
 I'm trying to understand the design of ranges. Why does popFront only set
 the front() property to return the next element in the range? Why not
 return the element in the call to popFront right away?
 
 For example code like this (which doesn't work since popFront doesn't
 return): void main()
 {
 int[] a = [1, 2];
 auto b = a.popFront;
 assert(a == [2]);
 assert(b == 1);
 }
 
 Isn't it wasteful to have to call both popFront() and front() to
 simultaneously remove an element from a range and return it? I mean it's
 an extra function call, right?
Often, it would be wasteful to have popFront return an element, since it's perfectly legitimate to call popFront without wanting to ever see front. If the type of front is a struct, then it would have to be copied which could cost more than a function call. And both front and popFront would typically be inlined anyway (at least as long as you compile with -inline), so it's unlikely that it really costs you a function call anyway. Really, however, I believe that it's a matter of functionality. front and popFront do two very different things. One of them gives you the first element; the other removes the first element. You can call them both separately without wanting to do the other. There are plenty of cases where you want to call popFront multiple times without ever calling front, and there are plenty of cases where you want to call front without calling popFront. It _definitely_ would be a problem if front were removed in favor of having popFront return the front element, but even if you kept front and just made popFront return an element, it's conflating two separate operations into one function, which often is a bad idea. In the case of popFront, I often call popFront without wanting to look at front immediately if at all, and in most cases where you _do_ want to look at front every time, you're probably using a foreach loop and not calling popFront directly anyway. So, making popFront return an element wouldn't help you much. So, really, there's no need to make popFront return front, it's more likely to be _less_ efficient to do that, rather than more, and it's conflating two separate operations. I see little to no benefit to making popFront return anything, and it could be detrimental for it to. - Jonathan M Davis
Apr 13 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Oh, I thought the compiler could optimize calls to popFront if I'm not
assigning its value.

Doesn't a technique exist where if a function is called and I'm not
assigning its result anywhere it should simply exit the function
without returning anything? It seems like a valuable compiler
optimization, if that is even possible.
Apr 13 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Andrej Mitrovic:

 Oh, I thought the compiler could optimize calls to popFront if I'm not
 assigning its value.
 
 Doesn't a technique exist where if a function is called and I'm not
 assigning its result anywhere it should simply exit the function
 without returning anything? It seems like a valuable compiler
 optimization, if that is even possible.
The function may have different calling points, including having its pointer taken and passed around. So you generally need the full compiled function, that generates the output too. If the compiler is able to perform some kind of "whole program optimization" (today both GCC and LLVM are able to do some of it), and the compiler is able to infer that the result of popFront is never used in the whole program, then it might be able to remove the dead code that generates the output. If the compiler inlines popFront at a call point, and in this call point the result is not used, all compilers are usually able to remove the useless computations of the result. Some time ago I have proposed a __used_result boolean flag, usable inside functions. If inside a function you use this value (inside a "static if"), the function becomes a template, and it generally gets compiled two times in the final binary. If at a calling point the result of the function doesn't get used, the compiler calls the template instantiation where __used_result is false, so with the static if you are able to avoid computing the result. So this program: int count = 0; int foo(int x) { count++; static if (__used_result) { int result = x * x; return result; } } void main() { foo(10); int y = foo(20); } Gets rewritten by the compiler to something like: int count = 0; auto foo(bool use_return)(int x) { count++; static if (use_return) { int result = x * x; return result; } } void main() { foo!false(10); int y = foo!true(20); } That produces: _D4test11__T3fooVb0Z3fooFiZv comdat push EAX mov ECX,FS:__tls_array mov EDX,[ECX] inc dword ptr _D4test5counti[EDX] pop EAX ret _D4test11__T3fooVb1Z3fooFiZi comdat push EAX mov ECX,FS:__tls_array mov EDX,[ECX] inc dword ptr _D4test5counti[EDX] imul EAX,EAX pop ECX ret Bye, bearophile
Apr 13 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
 Oh, I thought the compiler could optimize calls to popFront if I'm not
 assigning its value.
 
 Doesn't a technique exist where if a function is called and I'm not
 assigning its result anywhere it should simply exit the function
 without returning anything? It seems like a valuable compiler
 optimization, if that is even possible.
That would depend on the type. For anything other than structs, if the function is inlined, it should be able to do that (if it's not inlined, it can't, because the returning is done inside of the function call). For structs, it can probably do that if there's no postblit constructor and none of its member variables have postblit constructors (or have member variables of their own which have postblit constructors, etc.). If there's a postblit constructor, however, it would have to be pure, otherwise it could have side effects, and not doing the return could then alter the behavior of the program. However, if it's pure, it probably can. I don't know what the compiler does exactly. Regardless, in the general case, the return value can't be optimized out, because that's done in the function call, not at the call site. Inlined functions should be able to get the optimization as long as it doesn't change the behavior of the program to do so, which could require that the postblit constructor is pure if a struct is being returned, and if you're returning an expression, then that expression is still going to have to be evaluated unless the compiler can determine that it has no side effects, which again, could require any functions involved to be pure. So, don't bet on such an optimization happening. dmd might manage it in some cases, but in the most obvious cases, the cost of the return is pretty low anyway (since it's either a primitive type or a reference). - Jonathan M Davis
Apr 13 2011
prev sibling next sibling parent spir <denis.spir gmail.com> writes:
On 04/14/2011 01:00 AM, Andrej Mitrovic wrote:
 I'm trying to understand the design of ranges. Why does popFront only set the
front() property to return the next element in the range? Why not return the
element in the call to popFront right away?

 For example code like this (which doesn't work since popFront doesn't return):
 void main()
 {
      int[] a = [1, 2];
      auto b = a.popFront;
      assert(a == [2]);
      assert(b == 1);
 }

 Isn't it wasteful to have to call both popFront() and front() to
simultaneously remove an element from a range and return it? I mean it's an
extra function call, right?
I like to have three members (even if not quite necessary, this cleanly separates notions). Why I don't understand is why empty and front are methods, not simple data members. Denis -- _________________ vita es estrany spir.wikidot.com
Apr 14 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 I'm trying to understand the design of ranges. Why does popFront only set the
front() property to return the next element in the range? Why not return the element in the call to popFront right away?
 For example code like this (which doesn't work since popFront doesn't return):
 void main()
 {
     int[] a = [1, 2];
     auto b = a.popFront;
     assert(a == [2]);
     assert(b == 1);
 }

 Isn't it wasteful to have to call both popFront() and front() to simultaneously
remove an element from a range and return it? I mean it's an extra function call, right? Those ranges are intended for functional programming! Most of the time, not returning anything means much much less work. Most ranges you'll encounter are _lazily_ evaluated. Eg: int[]a=[1,2,...]; auto b=map!foo(map!bar1(map!bar2(a)); auto c=chain(a,b); auto d=zip(a,b); auto e=zip(take(repeat(d),c.length));//note infinite range here... //etc, you get the idea Note that this code NEVER actually calls any of foo or bar1/bar2. (if this was not so, the code would enter an infinite loop at repeat(d)) It keeps not calling them when you call popFront(). (why would you want to do this anyways? I think it has only few applications, one of them is the implementation of opApply or new lazy range returning functions). When you now do e.front(), suddenly, a lot of stuff gets computed, so that you get a look at the first element of the range, again you usually don't really want to do this. It is more likely that you will call "take" or a fold/reduce on that range.
Apr 14 2011
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
This leads me to another question I've always wanted to ask. A call such as:

auto b=map!foo(map!bar1(map!bar2(a));

This constructs a lazy range. What I'm wondering is if there are any
performance issues when constructing long chains of ranges like that,
since this basically routes one function to the next, which routes to
the next, etc. E.g:

auto a = [1, 2];
auto b = retro(a);
auto c = retro(b);
auto d = retro(c);

Under the surface, I assume calling d.front would mean the following calls:
d.front()->c.back()->b.front()->a.back()

Or another example:
auto b = retro(retro(a));

Can the compiler optimize the second case and convert b.front to just
do one field access?
Apr 14 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Apr 2011 12:57:10 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 This leads me to another question I've always wanted to ask. A call such  
 as:

 auto b=map!foo(map!bar1(map!bar2(a));

 This constructs a lazy range. What I'm wondering is if there are any
 performance issues when constructing long chains of ranges like that,
 since this basically routes one function to the next, which routes to
 the next
Of course. Ranges are very dependent on inlining for their performance benefit. Which means you are depending on the compiler inlining the code in order to achieve good performance. The compiler doesn't always inline, and I'm not sure how deep it can go.
 Or another example:
 auto b = retro(retro(a));

 Can the compiler optimize the second case and convert b.front to just
 do one field access?
from std.range: auto retro(Range)(Range r) if (isBidirectionalRange!(Unqual!Range)) { // Check for retro(retro(r)) and just return r in that case static if (is(typeof(retro(r.source)) == Range)) { return r.source; } ... So yeah, it can be optimized somewhat. -Steve
Apr 14 2011
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
 On Thu, 14 Apr 2011 12:57:10 -0400, Andrej Mitrovic
 
 <andrej.mitrovich gmail.com> wrote:
 This leads me to another question I've always wanted to ask. A call such
 as:
 
 auto b=map!foo(map!bar1(map!bar2(a));
 
 This constructs a lazy range. What I'm wondering is if there are any
 performance issues when constructing long chains of ranges like that,
 since this basically routes one function to the next, which routes to
 the next
Of course. Ranges are very dependent on inlining for their performance benefit. Which means you are depending on the compiler inlining the code in order to achieve good performance. The compiler doesn't always inline, and I'm not sure how deep it can go.
IIRC, I believe that I brought that up at one point and Walter said that it could inline infinitely deep (assuming that that it makes sense of course). However, I don't think that there's much question that the inliner could be improved. If nothing else, there are too many things in the language which currently preclude inlining, and as I understand it, the inliner could use some work on how well it does at inlining what it does inline. But I haven't studied it, so I don't know how good it ultimately is. - Jonathan M Davis
Apr 14 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Andrei Mitrovic:
 Can the compiler optimize the second case and convert b.front to just
do one field access? In simple cases, obviously yes: import std.range; import std.stdio; void main(){ int[] a=new int[1000]; auto b=retro(retro(a)); writeln(b.front); } Produces: .text:08094B64 public _Dmain .text:08094B64 _Dmain proc near ; CODE XREF: _D2rt6dmain24mainUiPPaZi7runMainMFZv+15p .text:08094B64 push ebp .text:08094B65 mov ebp, esp .text:08094B67 mov eax, offset _D11TypeInfo_Ai6__initZ .text:08094B6C push 3E8h .text:08094B71 push eax .text:08094B72 call _d_newarrayT .text:08094B77 add esp, 8 .text:08094B7A mov eax, offset _D3std5stdio6stdoutS3std5stdio4File .text:08094B7F push dword ptr [edx] .text:08094B81 push 0Ah .text:08094B83 call _D3std5stdio4File14__T5writeTiTaZ5writeMFiaZv .text:08094B88 xor eax, eax .text:08094B8A pop ebp .text:08094B8B retn .text:08094B8B _Dmain endp ; sp = -8 DMD even recognizes the fact, that edx contains the pointer to the first element of b. I do not know about more complex cases, I think it depends on DMD's inlining caps, which are (naturally) somewhat poorer than gcc's at the moment.
Apr 14 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Apr 2011 13:36:07 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 Andrei Mitrovic:
 Can the compiler optimize the second case and convert b.front to just
do one field access? In simple cases, obviously yes: import std.range; import std.stdio; void main(){ int[] a=new int[1000]; auto b=retro(retro(a)); writeln(b.front); } Produces: .text:08094B64 public _Dmain .text:08094B64 _Dmain proc near ; CODE XREF: _D2rt6dmain24mainUiPPaZi7runMainMFZv+15%19p .text:08094B64 push ebp .text:08094B65 mov ebp, esp .text:08094B67 mov eax, offset _D11TypeInfo_Ai6__initZ .text:08094B6C push 3E8h .text:08094B71 push eax .text:08094B72 call _d_newarrayT .text:08094B77 add esp, 8 .text:08094B7A mov eax, offset _D3std5stdio6stdoutS3std5stdio4File .text:08094B7F push dword ptr [edx] .text:08094B81 push 0Ah .text:08094B83 call _D3std5stdio4File14__T5writeTiTaZ5writeMFiaZv .text:08094B88 xor eax, eax .text:08094B8A pop ebp .text:08094B8B retn .text:08094B8B _Dmain endp ; sp = -8 DMD even recognizes the fact, that edx contains the pointer to the first element of b. I do not know about more complex cases, I think it depends on DMD's inlining caps, which are (naturally) somewhat poorer than gcc's at the moment.
Nah, it's simpler than that: assert(is(typeof(b) == int[])); which is done by retro (it detects if you're retro-ing a retro range, and just returns the original). -Steve
Apr 14 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
 Nah, it's simpler than that:

 assert(is(typeof(b) == int[]));

 which is done by retro (it detects if you're retro-ing a retro range, and
 just returns the original).

 -Steve
Makes sense. But still, I think inlining is responsible for the following code generating identical machine code: import std.range; import std.stdio; import std.algorithm; int id(int a){return a;} void main(){ int[] a=new int[1000]; auto b=retro(map!id(map!id(retro(a)))); writeln(b.front); } Correct me if I'm wrong, but I think phobos does only handle the special case retro(retro(x)) explicitly. -Timon
Apr 14 2011
prev sibling parent reply spir <denis.spir gmail.com> writes:
On 04/14/2011 06:57 PM, Andrej Mitrovic wrote:
 This leads me to another question I've always wanted to ask. A call such as:

 auto b=map!foo(map!bar1(map!bar2(a));

 This constructs a lazy range. What I'm wondering is if there are any
 performance issues when constructing long chains of ranges like that,
 since this basically routes one function to the next, which routes to
 the next, etc. E.g:

 auto a = [1, 2];
 auto b = retro(a);
 auto c = retro(b);
 auto d = retro(c);

 Under the surface, I assume calling d.front would mean the following calls:
 d.front()->c.back()->b.front()->a.back()

 Or another example:
 auto b = retro(retro(a));

 Can the compiler optimize the second case and convert b.front to just
 do one field access?
If it does optimise, then it is definitely a compiler bug. Since you *explicitely* ask for a double reverse, it *must* just do it. Suppressing them is here just breaking the language's semantics! It is not the compiler's role to interpret, meaning to guess your reasons for requesting that; and the compiler is certainly not in position to do it. Even less to *judge* that request as stupid, and thus ignore it (unlike in the army ;-). You are the master, asking for stupid computations is both your right and your problem ;-) Anyway, there are probably perfectly valid reasons to do a double reverse; at least (0) exploring (1) teaching (2) benchmarking. In a similar vein: { long n; static N = 10_000; foreach (_ ; 0..N) { n = factorial(9); } } Should the compiler optimise by computing n only once? (even possibly at compile-time) Then, what if I'm doing that in purpose? (be it stupid or not) Denis -- _________________ vita es estrany spir.wikidot.com
Apr 14 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 14 Apr 2011 14:24:41 -0400, spir <denis.spir gmail.com> wrote:

 On 04/14/2011 06:57 PM, Andrej Mitrovic wrote:
 This leads me to another question I've always wanted to ask. A call  
 such as:

 auto b=map!foo(map!bar1(map!bar2(a));

 This constructs a lazy range. What I'm wondering is if there are any
 performance issues when constructing long chains of ranges like that,
 since this basically routes one function to the next, which routes to
 the next, etc. E.g:

 auto a = [1, 2];
 auto b = retro(a);
 auto c = retro(b);
 auto d = retro(c);

 Under the surface, I assume calling d.front would mean the following  
 calls:
 d.front()->c.back()->b.front()->a.back()

 Or another example:
 auto b = retro(retro(a));

 Can the compiler optimize the second case and convert b.front to just
 do one field access?
If it does optimise, then it is definitely a compiler bug. Since you *explicitely* ask for a double reverse, it *must* just do it. Suppressing them is here just breaking the language's semantics!
I feel like people aren't looking at my post :) retro *specifically* returns the source range if you retro it again. from std.range: auto retro(Range)(Range r) if (isBidirectionalRange!(Unqual!Range)) { // Check for retro(retro(r)) and just return r in that case static if (is(typeof(retro(r.source)) == Range)) { return r.source; } ... -Steve
Apr 14 2011
parent spir <denis.spir gmail.com> writes:
On 04/14/2011 08:33 PM, Steven Schveighoffer wrote:
 If it does optimise, then it is definitely a compiler bug. Since you
 *explicitely* ask for a double reverse, it *must* just do it. Suppressing
 them is here just breaking the language's semantics!
I feel like people aren't looking at my post :)
Sorry, I wrote this reply before reading yours. denis -- _________________ vita es estrany spir.wikidot.com
Apr 14 2011
prev sibling next sibling parent David Nadlinger <see klickverbot.at> writes:
On 4/14/11 8:24 PM, spir wrote:
 If it does optimise, then it is definitely a compiler bug. Since you
 *explicitely* ask for a double reverse, it *must* just do it.
 Suppressing them is here just breaking the language's semantics!
 […]
Sigh… First, as Steve already pointed out, retro() specifically shortcuts reversing a range twice to the original range. Second, your remarks about optimization are not quite correct, at least for most of today's high-level languages. A compiler is free to optimize a program in whatever possible way, as long as it can prove the result equivalent to the original. And as burning CPU-cycles is not part of the program semantics, a modern optimizing compiler is free to optimize the loop and factorial call away altogether. If you don't believe me, have a look at the attached sample program and the asm generated for main() (some recent Clang version, clang -O3). David --- #include <stdio.h> int factorial(int a) { int p = 1; for (int i = 2; i <= a; ++i) { p *= i; } return p; } int main (int argc, char const *argv[]) { int t; for (long i = 0; i < 10000000; ++i) { t = factorial(9); } printf("%d\n", t); return 0; } _main: 0000000100000ee0 pushq %rbp 0000000100000ee1 movq %rsp,%rbp 0000000100000ee4 leaq 0x00000041(%rip),%rdi 0000000100000eeb movl $0x00058980,%esi ; 9! in hex 0000000100000ef0 xorb %al,%al 0000000100000ef2 callq 0x100000f02; symbol stub for: _printf 0000000100000ef7 xorl %eax,%eax 0000000100000ef9 popq %rbp 0000000100000efa ret
Apr 14 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 Should the compiler optimise by computing n only once? (even possibly at
compile-time)
 Then, what if I'm doing that in purpose? (be it stupid or not)

 Denis
What is the difference? The only thing that is affected by this optimization is execution speed. Optimizing is about implementing the requested semantics in an efficient matter. Why would you actually want to suppress completely safe optimizations? Also, when benchmarking, you don't usually want to benchmark unoptimized code. Or am I misunderstanding something? If absolutely necessary, you can always use the volatile keyword to disable all optimizations on a given variable, or call the function explicitly using inline ASM.
Apr 14 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Timon Gehr:

 If absolutely necessary, you can always use the volatile keyword to disable all
 optimizations on a given variable, or call the function explicitly using
inline ASM.
"volatile" is present only in D1... Bye, bearophile
Apr 14 2011
prev sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
 On 04/14/2011 01:00 AM, Andrej Mitrovic wrote:
 I'm trying to understand the design of ranges. Why does popFront only set
 the front() property to return the next element in the range? Why not
 return the element in the call to popFront right away?
 
 For example code like this (which doesn't work since popFront doesn't
 return): void main()
 {
 
 int[] a = [1, 2];
 auto b = a.popFront;
 assert(a == [2]);
 assert(b == 1);
 
 }
 
 Isn't it wasteful to have to call both popFront() and front() to
 simultaneously remove an element from a range and return it? I mean it's
 an extra function call, right?
I like to have three members (even if not quite necessary, this cleanly separates notions). Why I don't understand is why empty and front are methods, not simple data members.
They're properties. They could be member variables or they could be member functions. It's up to the implementation of a particular range as to whether they're member variables or member functions, though they're likely to be member functions in most cases. In most cases, there wouldn't be any member variable corresponding to empty (more likely, it's a check against the length of the range or does something to determine whether there's more left in the range rather than being a member variable indicating whether the range is empty), and even with front, there could easily be a calculation of some kind or additional checks (e.g. to make it throw if empty is true) in front. There is no requirement that front or empty be either functions or variables. They're properties, so they can be whatever makes the most sense for that particular range type. And as for arrays, they _have_ to be functions, since they're added by Phobos. - Jonathan M Davis
Apr 14 2011