digitalmars.D - foreach, opApply, and inline

Tor Myklebust <tmyklebu csclub.uwaterloo.ca> writes:
When I compile the following code with dmd 1.015, separate functions are
generated for main, subset's opApply, and the body of the foreach.  Is
there a way to get dmd to inline opApply and the body of the foreach
when the opApply is known and the body of the foreach is known?  (More
generally, is it possible to declare a function accepting a delegate
argument so that the delegate will get inlined if it is known at compile
time?)

import std.stdio;

struct subset {
 int j;
 int opApply(int delegate(inout int i) foo) {
  for (int i = 0; i < j; i = (i+~j+1)&j) foo(i);
  return 0;
 }
}

void main() {
 foreach(i; subset(12345)) printf("%i\n", i);
}

The relevant assembly code is thus:

Disassembly of section .gnu.linkonce.t_D3foo6subset7opApplyMFDFKiZiZi:

0804a628 <_D3foo6subset7opApplyMFDFKiZiZi>:
 804a628:       55                      push   %ebp
 804a629:       8b ec                   mov    %esp,%ebp
 804a62b:       83 ec 10                sub    $0x10,%esp
 804a62e:       53                      push   %ebx
 804a62f:       56                      push   %esi
 804a630:       89 45 f8                mov    %eax,0xfffffff8(%ebp)
 804a633:       8b d8                   mov    %eax,%ebx
 804a635:       31 c9                   xor    %ecx,%ecx
 804a637:       89 4d fc                mov    %ecx,0xfffffffc(%ebp)
 804a63a:       39 0b                   cmp    %ecx,(%ebx)
 804a63c:       7e 33                   jle    804a671 <_D3foo6subset7opApplyMFDFKiZiZi+0x49>
 804a63e:       89 5d f8                mov    %ebx,0xfffffff8(%ebp)
 804a641:       8b 55 0c                mov    0xc(%ebp),%edx
 804a644:       8b 45 08                mov    0x8(%ebp),%eax
 804a647:       89 d6                   mov    %edx,%esi
 804a649:       8b 5d f8                mov    0xfffffff8(%ebp),%ebx
 804a64c:       8d 4d fc                lea    0xfffffffc(%ebp),%ecx
 804a64f:       51                      push   %ecx
 804a650:       8b 45 08                mov    0x8(%ebp),%eax
 804a653:       ff d6                   call   *%esi
 804a655:       8b 13                   mov    (%ebx),%edx
 804a657:       89 55 f4                mov    %edx,0xfffffff4(%ebp)
 804a65a:       f7 d2                   not    %edx
 804a65c:       8b 4d fc                mov    0xfffffffc(%ebp),%ecx
 804a65f:       8d 54 11 01             lea    0x1(%ecx,%edx,1),%edx
 804a663:       8b 4d f4                mov    0xfffffff4(%ebp),%ecx
 804a666:       23 d1                   and    %ecx,%edx
 804a668:       89 55 fc                mov    %edx,0xfffffffc(%ebp)
 804a66b:       8b 0b                   mov    (%ebx),%ecx
 804a66d:       3b ca                   cmp    %edx,%ecx
 804a66f:       7f db                   jg     804a64c <_D3foo6subset7opApplyMFDFKiZiZi+0x24>
 804a671:       31 c0                   xor    %eax,%eax
 804a673:       5e                      pop    %esi
 804a674:       5b                      pop    %ebx
 804a675:       8b e5                   mov    %ebp,%esp
 804a677:       5d                      pop    %ebp
 804a678:       c2 08 00                ret    $0x8
 804a67b:       90                      nop
Disassembly of section .gnu.linkonce.t_Dmain:

0804a67c <_Dmain>:
 804a67c:       55                      push   %ebp
 804a67d:       8b ec                   mov    %esp,%ebp
 804a67f:       83 ec 08                sub    $0x8,%esp
 804a682:       b8 a0 a6 04 08          mov    $0x804a6a0,%eax
 804a687:       50                      push   %eax
 804a688:       6a 00                   push   $0x0
 804a68a:       c7 45 fc 39 30 00 00    movl   $0x3039,0xfffffffc(%ebp)
 804a691:       8d 45 fc                lea    0xfffffffc(%ebp),%eax
 804a694:       e8 8f ff ff ff          call   804a628 <_D3foo6subset7opApplyMFDFKiZiZi>
 804a699:       31 c0                   xor    %eax,%eax
 804a69b:       8b e5                   mov    %ebp,%esp
 804a69d:       5d                      pop    %ebp
 804a69e:       c3                      ret
 804a69f:       90                      nop
Disassembly of section .gnu.linkonce.t_D3foo4mainFZv14__foreachbody1MFKiZi:

0804a6a0 <_D3foo4mainFZv14__foreachbody1MFKiZi>:
 804a6a0:       55                      push   %ebp
 804a6a1:       8b ec                   mov    %esp,%ebp
 804a6a3:       50                      push   %eax
 804a6a4:       8b 4d 08                mov    0x8(%ebp),%ecx
 804a6a7:       ff 31                   pushl  (%ecx)
 804a6a9:       ba 68 b6 05 08          mov    $0x805b668,%edx
 804a6ae:       52                      push   %edx
 804a6af:       e8 3c f2 ff ff          call   80498f0 <printf@plt>
 804a6b4:       83 c4 08                add    $0x8,%esp
 804a6b7:       31 c0                   xor    %eax,%eax
 804a6b9:       8b e5                   mov    %ebp,%esp
 804a6bb:       5d                      pop    %ebp
 804a6bc:       c2 04 00                ret    $0x4
 804a6bf:       90                      nop


Tor Myklebust
Oct 07 2007
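For readers following the disassembly: the three functions it shows match how D lowers a foreach over a type with opApply. The loop body becomes a nested function (__foreachbody1 in the listing), and a delegate to it is passed to opApply, which calls it indirectly (the `call *%esi` above). A minimal hand-written sketch of that lowering follows. It assumes a current D2 compiler, so it uses `ref` where the original D1 code uses `inout`, and the names Subset, viaForeach, viaExplicitDelegate, and foreachBody are made up for illustration. Unlike the original, this opApply also forwards a nonzero delegate result, which is what lets `break` leave the loop.

```d
import std.stdio;

struct Subset {
    int j;

    // Same iteration as the original post; a nonzero result from the
    // delegate stops the loop and is passed back to the caller.
    int opApply(scope int delegate(ref int) dg) {
        for (int i = 0; i < j; i = (i + ~j + 1) & j) {
            if (auto r = dg(i)) return r;
        }
        return 0;
    }
}

// The form the programmer writes.
int[] viaForeach(int mask) {
    int[] seen;
    foreach (i; Subset(mask)) seen ~= i;
    return seen;
}

// Roughly what the compiler turns it into: the loop body is a nested
// function, and a delegate to it is handed to opApply.
int[] viaExplicitDelegate(int mask) {
    int[] seen;
    int foreachBody(ref int i) { seen ~= i; return 0; }
    Subset(mask).opApply(&foreachBody);
    return seen;
}

void main() {
    writeln(viaForeach(5));          // prints [0, 1, 4]
    writeln(viaExplicitDelegate(5)); // prints [0, 1, 4]
}
```

Both functions visit the same values. The mask itself (5) is not visited, because the loop condition is `i < j`, as in the original.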
BCS <ao pathlink.com> writes:
Reply to Tor,

 When I compile the following code with dmd 1.015, separate functions
 are generated for main, subset's opApply, and the body of the foreach.
 Is there a way to get dmd to inline opApply and the body of the
 foreach when the opApply is known and the body of the foreach is
 known?  (More generally, is it possible to declare a function
 accepting a delegate argument so that the delegate will get inlined if
 it is known at compile time?)
 
I assume you tried DMD's -inline flag? I have often thought that the inlining should work in the more general case where small functions get called with a known delegate.
Oct 07 2007
Tor Myklebust <tmyklebu csclub.uwaterloo.ca> writes:
BCS <ao pathlink.com> wrote:
 Reply to Tor,
 
 When I compile the following code with dmd 1.015, separate functions
 are generated for main, subset's opApply, and the body of the foreach.
 Is there a way to get dmd to inline opApply and the body of the
 foreach when the opApply is known and the body of the foreach is
 known?  (More generally, is it possible to declare a function
 accepting a delegate argument so that the delegate will get inlined if
 it is known at compile time?)
 
I assume you tried DMD's -inline flag?
Yes. That was compiled with -O -inline -release.
 I have often thought that the inlining should work in the more general case 
 where small functions get called with a known delegate.
I don't think it should in general; if the "small function" calls the delegate multiple times, this can result in extremely bloated code.

For the specific case of opApply() happening because of a foreach loop, our intuition as programmers is that the compiled result won't contain any unnecessary function calls.

(Imagine if "for (size_t i=0;i<n;i++) foo += bar[i];" generated a function for doing "foo += bar[i]" and a function for doing the iteration itself. You'd have to write for-loops yourself using if-goto again. That would suck, wouldn't it?)

Tor Myklebust
Oct 07 2007
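One way to get a foreach with no delegate involved, not mentioned in this thread and only available in later D2 compilers, is to give the type the input range primitives empty, front, and popFront. foreach over such a type is lowered to an ordinary for loop that calls those three members directly. The sketch below is an assumption-laden illustration: SubsetRange and viaRange are hypothetical names, and whether a particular compiler version then inlines the members is not something the thread establishes.

```d
import std.stdio;

// Hypothetical range form of the original subset struct (D2 only).
struct SubsetRange {
    int j;      // the mask
    int i = 0;  // current value

    bool empty() const { return i >= j; }      // mirrors the original `i < j` test
    int front() const { return i; }
    void popFront() { i = (i + ~j + 1) & j; }  // same step as the original loop
}

// foreach here becomes a plain loop over empty/front/popFront,
// with no delegate and no generated loop-body function.
int[] viaRange(int mask) {
    int[] seen;
    foreach (i; SubsetRange(mask)) seen ~= i;
    return seen;
}

void main() {
    writeln(viaRange(12)); // prints [0, 4, 8]
}
```

The calls are direct calls to small, statically known member functions, which is the easy case for an inliner, in contrast to the indirect delegate call in the opApply version.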
BCS <ao pathlink.com> writes:
Reply to Tor,

 BCS <ao pathlink.com> wrote:
 
 I have often thought that the inlining should work in the more general
 case where small functions get called with a known delegate.
 
I don't think it should in general; if the "small function" calls the delegate multiple times, this can result in extremely bloated code.
Well, of course any optimization needs to be implemented with some sanity checks. What I want dealt with is the idiom where a call to a small function takes a delegate literal. Say, something like this:

void Attr(char[] s)(void delegate() dg)
{
 writef("<"~s~"> ");
 scope(exit) writef("</"~s~"> ");  // emit the closing tag on the way out
 dg();
}

alias Attr!("b") Bold;
alias Attr!("http") Doc;
alias Attr!("body") Body;
alias Attr!("head") Head;

Doc({
 Head({
  ...
 });
 Body({
  writef("Hello ");
  Bold({
   writef("world");
  });
 });
});
Oct 08 2007
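On the more general question of a delegate that is known at compile time: in later D2 compilers the callable can be passed as an alias template parameter instead of a run-time delegate argument. Each instantiation then calls a statically known function, which is the precondition for inlining it. The sketch below is a hedged adaptation of the idiom above, not something proposed in the thread. The names attr, content, output, and render are hypothetical, and the text is appended to a module-level string instead of being written with writef so that the result can be inspected. Whether dmd actually inlines these calls is not established here.

```d
import std.stdio;

string output; // module-level buffer, used only to make the result checkable

// The body is an alias template parameter, so every instantiation
// of attr knows exactly which function it calls.
void attr(string tag, alias content)() {
    output ~= "<" ~ tag ~ "> ";
    scope (exit) output ~= "</" ~ tag ~ "> ";
    content();
}

string render() {
    output = "";
    attr!("body", () {
        output ~= "Hello ";
        attr!("b", () { output ~= "world "; })();
    })();
    return output;
}

void main() {
    writeln(render()); // prints <body> Hello <b> world </b> </body>
}
```

The trade-off is the one raised earlier in the thread: every distinct body produces a separate instantiation of attr, so code size grows with the number of call sites.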