digitalmars.D - Why is not inlining that bad?

reply "Janice Caron" <caron800 googlemail.com> writes:
Forgive me if this is a dumb question, but in other threads I've seen
it argued that dereferencing an address from a register offset is not
something that anyone needs to be worried about, and that array
accesses are so fast that no one need worry about them, etc.

This being the case, why is anyone worried about the overhead of a
function call? It's just a memory write and a few registers changing,
surely? It's not a massively expensive operation like a thread switch
or anything, so why worry?

If the hardware does memory caching, the return may not even need a
memory access.

What am I missing? Is it just that D initializes all its local
variables as part of calling a function? If so, there are plenty of
ways around that.
Oct 08 2007
next sibling parent Bruce Adams <tortoise_74 yeah.who.co.uk> writes:
Janice Caron Wrote:

 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.
 
 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?
 
 If the hardware does memory caching, the return may not even need a
 memory access.
 
 What am I missing? Is it just that D initializes all its local
 variables as part of calling a function? If so, there are plenty of
 ways around that.
Actually, the overhead of a function call can be significant, at least in some cases. You have to push all the arguments onto the stack. If there are a lot of them, and particularly if they are passed by value, this makes a difference. You are also changing the program counter to a different location, which might well not be in the instruction cache. If your loop is executed 10,000 times, this overhead may add up.

Also, if the code is inline the compiler can optimise it as a block. This might include moving some initialisation steps outside the loop. It can also ensure that variables stay in the same registers.

In general you should prefer optimising your code by changing the algorithm so that the body of the loop is executed less often, but sometimes inlining can make the difference.

Regards,

Bruce.
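To illustrate the point about moving initialisation steps outside the loop, here is a minimal sketch in C (not D; the idea carries over). The names `scale_sample`, `sum_scaled_call` and `sum_scaled_inlined` are hypothetical, and the second function is written by hand to show roughly what a compiler is free to do once the helper is inlined. Whether a given compiler actually does this is an assumption, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the gain does not depend on the sample,
   yet it is recomputed on every call. */
static double scale_sample(double sample, double num, double den)
{
    double gain = num / den;
    return sample * gain;
}

/* One call per element: arguments are passed and the division
   is repeated on every iteration unless the call is inlined. */
double sum_scaled_call(const double *samples, size_t n, double num, double den)
{
    assert(den != 0.0);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += scale_sample(samples[i], num, den);
    return total;
}

/* Hand-inlined version: with the body visible in the loop, the
   loop-invariant division can be hoisted and done once. */
double sum_scaled_inlined(const double *samples, size_t n, double num, double den)
{
    assert(den != 0.0);
    double gain = num / den;   /* hoisted out of the loop */
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += samples[i] * gain;
    return total;
}
```

Both functions compute the same sum; only the amount of work per iteration differs.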
Oct 08 2007
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Janice Caron wrote:
 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.
 
 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?
 
 If the hardware does memory caching, the return may not even need a
 memory access.
 
 What am I missing? Is it just that D initializes all its local
 variables as part of calling a function? If so, there are plenty of
 ways around that.
Inlining a function, besides getting rid of the function call/return code, which can be significant, also enables interprocedural optimizations: register assignment, common subexpressions, constant folding, etc. It can result in dramatically fewer instructions being executed. Besides, it is more code memory cache friendly.
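As a rough illustration of constant folding after inlining, here is a small C sketch (C rather than D, purely for illustration). The functions `clamp`, `brightness_default` and `brightness` are hypothetical names, and the folding described in the comments is what an optimizer may do, not something the language promises.

```c
#include <assert.h>

/* Hypothetical small helper that is a natural inlining candidate. */
static inline int clamp(int value, int lo, int hi)
{
    assert(lo <= hi);
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* All three arguments are constants. Once clamp() is inlined, both
   comparisons can be evaluated at compile time and the whole body
   may reduce to "return 255". */
int brightness_default(void)
{
    return clamp(300, 0, 255);
}

/* Only the bounds are constants here. After inlining they can be
   used as immediates instead of being passed as arguments. */
int brightness(int requested)
{
    return clamp(requested, 0, 255);
}
```

Without inlining, both callers have to set up three arguments and perform a real call, even though one of them has a result that is known before the program runs.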
Oct 08 2007
parent reply 0ffh <spam frankhirsch.net> writes:
Walter Bright wrote:
 Inlining a function, besides getting rid of the function call/return 
 code, which can be significant, also enables interprocedural 
 optimizations: register assignment, common subexpressions, constant 
 folding, etc. It can result in dramatically fewer instructions being 
 executed. Besides, it is more code memory cache friendly.
Reminds me of: news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com Am I lucky when I, as BCS so elegantly put it, "say my prayers to the deities of optimization", and hope that debugfln(...){} will be reduced to even less than call/retn? Regards, frank
Oct 08 2007
next sibling parent 0ffh <spam frankhirsch.net> writes:
0ffh wrote:
 Am I lucky when I, as BCS so elegantly put it, "say my prayers to the
 deities of optimization", and hope that debugfln(...){} will be reduced
 to even less than call/retn?
Of course it was Bill Baxter who put it that way; my memory plays tricks on me! Sorry for the misattribution, both! :-) Regards, frank
Oct 08 2007
prev sibling parent BCS <BCS pathlink.com> writes:
0ffh wrote:
 Walter Bright wrote:
 
 Inlining a function, besides getting rid of the function call/return 
 code, which can be significant, also enables interprocedural 
 optimizations: register assignment, common subexpressions, constant 
 folding, etc. It can result in dramatically fewer instructions being 
 executed. Besides, it is more code memory cache friendly.
Reminds me of: news://news.digitalmars.com:119/fdpmra$14n3$1 digitalmars.com Am I lucky when I, as BCS so elegantly put it, "say my prayers to the deities of optimization", and hope that debugfln(...){} will be reduced to even less than call/retn? Regards, frank
BTW that was Bill Baxter, replying to me.
Oct 08 2007
prev sibling parent "Jb" <jb nowhere.com> writes:
"Janice Caron" <caron800 googlemail.com> wrote in message 
news:mailman.390.1191827338.16939.digitalmars-d puremagic.com...
 Forgive me if this is a dumb question, but in other threads I've seen
 it argued that dereferencing an address from a register offset is not
 something that anyone needs to be worried about, and that array
 accesses are so fast that no one need worry about them, etc.
Because addressing modes cost pretty much the same on modern x86 CPUs.

    MOV EAX,[EBX+32+ECX*8]

costs the same as

    MOV EAX,[$FFEECCDD]

Because it's an incredibly common thing to want to do, it is optimized to the best possible case.
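For a sense of where an address of the form base + 32 + index*8 comes from, here is a hedged C sketch. The types `struct pair` and `struct table` and the function `key_at` are made up for this example, and the layout described in the comments assumes a typical ABI with a 4-byte `int32_t` and no extra padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte element. */
struct pair {
    int32_t key;
    int32_t value;
};

/* Hypothetical container: a 32-byte header followed by the elements. */
struct table {
    char header[32];
    struct pair entries[16];
};

/* The address of entries[i].key is (table address) + 32 + i*8, which
   on 32-bit x86 fits a single base + displacement + scaled-index
   operand of the kind shown above. */
int32_t key_at(const struct table *t, size_t i)
{
    assert(i < 16);
    return t->entries[i].key;
}
```

On such a layout the field access plus array indexing can be a single load, which is why it is not worth worrying about separately.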
 This being the case, why is anyone worried about the overhead of a
 function call? It's just a memory write and a few registers changing,
 surely? It's not a massively expensive operation like a thread switch
 or anything, so why worry?
A call/ret pair costs 2-4 cycles on average. Pushing and popping parameters on the stack uses up 1 cycle per instruction as well, best case scenario, and there is usually some shuffling of data into different registers. And if the callee does anything more complicated than a handful of operations, it's likely that it will need to dump stuff off to the stack to free up more registers.

Plus, avoiding all that allows more opportunity for optimization. If inlined, things don't have to be shuffled round to suit the calling convention, whether stack based or register based.

So small functions with lots of parameters are the most likely beneficiaries.
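A minimal C sketch of the "small function, lots of parameters" case follows. The names `dot3` and `sum_dots` are hypothetical, and the comments describe what a non-inlined call typically involves rather than measured costs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tiny function: three multiplies and two adds, but six
   parameters that must be placed wherever the calling convention
   wants them before every call. */
static int dot3(int ax, int ay, int az, int bx, int by, int bz)
{
    return ax * bx + ay * by + az * bz;
}

/* Sums the dot products of `count` packed 3-vectors. If dot3() is not
   inlined, each iteration loads six values, moves or pushes them as
   arguments, calls, and returns. If it is inlined, the values can be
   used directly from wherever they were loaded. */
long sum_dots(const int *a, const int *b, size_t count)
{
    assert(a != NULL && b != NULL);
    long total = 0;
    for (size_t i = 0; i < count; ++i) {
        const int *pa = a + 3 * i;
        const int *pb = b + 3 * i;
        total += dot3(pa[0], pa[1], pa[2], pb[0], pb[1], pb[2]);
    }
    return total;
}
```

Here the argument setup is comparable in size to the useful work, which is the situation where inlining tends to pay off most.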
Oct 08 2007