digitalmars.D - inlining...

reply Manu <turkeyman gmail.com> writes:
So, I'm constantly running into issues with not having control over inline.
I've run into it again doing experiments in preparation for my dconf talk...

I have identified 2 cases which come up regularly:
 1. A function that should always be inline unconditionally (std.simd is
effectively blocked on this)
 2. A particular invocation of a function should be inlined for this call
only

The first case is just about having control over code gen. Some functions
should effectively be macros or pseudo-intrinsics (ie, intrinsic wrappers
in std.simd, beauty wrappers around asm code, etc), and I don't ever want
to see a symbol appear in the binary.

My suggestion is introduction of __forceinline or something like it. We
need this.


The second case is interesting, and I've found it comes up a few times on
different occasions.
In my current instance, I'm trying to build a generic framework to perform
efficient composable data processing, and a basic requirement is that the
components are inlined, such that the optimiser can interleave the work
properly.

Let's imagine I have a template which implements a work loop, which wants
to call a bunch of work elements it receives by alias. The issue is, each
of those must be inlined, for this call instance only, and there's no way
to do this.
I'm gonna draw the line at stringified code to use with mixin; I hate that,
and I don't want to encourage use of mixin or stringified code in
user-facing APIs as a matter of practice. Also, some of these work
elements might be useful functions in their own right, which means they can
indeed be a function existing somewhere else that shouldn't itself be
attributed as __forceinline.

What are the current options to force that some code is inlined?

My feeling is that an ideal solution would be something like an enhancement
which would allow the 'mixin' keyword to be used with regular function
calls.
What this would do is 'mix in' the function call at this location, ie,
effectively inline that particular call, and it leverages a keyword and
concept that we already have. It would obviously produce a compile error if
the code is not available.

I quite like this idea, but there is a potential syntactical problem; how
to assign the return value?

int func(int y) { return y*y+10; }

int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get in the way' if the output
int output = mixin(func(10)); // now i feel paren spammy...
mixin(int output = func(10)); // this doesn't feel right...

My feeling is the first is the best, but I'm not sure about that
grammatically.


The other thing that comes to mind is that it seems like this might make a
case for AST macros... but I think that's probably overkill for this
situation, and I'm not confident we're ever gonna attempt to crack that
nut. I'd like to see something practical and unobjectionable preferably.


This problem is fairly far reaching; phobos receives a lot of lambdas these
days, which I've found don't reliably inline and interfere with the
optimiser's ability to optimise the code.
There was some discussion about a code unrolling API some time back, and
this would apply there (the suggested solution used string mixins! >_<).
Debug build performance is a problem which would be improved with this
feature.
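
A minimal sketch of the second case, for illustration only. The names `workLoop`, `scale` and `offset` are made up, not from any library. The work elements are ordinary functions that remain usable on their own, and the loop receives them by alias. Nothing here forces the calls to be inlined; that decision is left entirely to the compiler, which is the problem described above.

```d
import std.stdio : writeln;

// Hypothetical work elements: plain functions, useful in their own right.
int scale(int x) { return x * 3; }
int offset(int x) { return x + 10; }

// Hypothetical work loop: receives the work elements by alias and
// applies them, in order, to every element of the data.
void workLoop(work...)(int[] data)
{
    foreach (ref x; data)
    {
        // Unrolled at compile time: one call per work element.
        // Whether each call is inlined is up to the compiler.
        static foreach (i; 0 .. work.length)
            x = work[i](x);
    }
}

void main()
{
    auto data = [1, 2, 3];
    workLoop!(scale, offset)(data);
    writeln(data); // [13, 16, 19]
}
```

If the calls inside the loop body are not inlined, each element pays a call per work element and the optimiser cannot interleave the work across them.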

Mar 13 2014
next sibling parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 14 March 2014 at 06:21:27 UTC, Manu wrote:
 ...

As much as I like the idea:

Something always tells me this is the compiler's job... What clever reasoning are you applying that the compiler's inliner can't? It seems like a different situation to, say, SIMD code, where correctly structuring loops can require a lot of gymnastics that the compiler can't or won't (floating point conformance) do. The inlining decision seems easily automatable in comparison.

I understand that unoptimised builds for debugging are a problem, but a sensible compiler lets you hand pick your optimisation passes.

In short: why are compilers not good enough at this that the programmer needs to be involved?
Mar 14 2014
parent Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On 3/14/2014 8:37 AM, Manu wrote:
 On 14 March 2014 22:02, John Colvin <john.loughran.colvin gmail.com> wrote:
  I don't know how good compilers are at taking this sort of thing into
  account already.

 I don't know if they try or not, but I can say from experience that results are generally unreliable. I would never depend on the inliner to get this right.

I don't know how this compares to other inliners, but FWIW, DMD's inliner is pretty simple (by coincidence, I was just digging into it the other day):

Every expression node (ie non-statement, non-declaration) in the function's AST adds 1 to the cost of inlining (so ex: 1+2*3 would have a cost of 2 - one mult, plus one addition). If the total cost is under 250, the function is inlined.

Also, any type of AST node that isn't explicitly handled in inline.c will prevent a function from ever being inlined (since the inliner doesn't know how to inline it). I assume this is probably after lowerings are done, though, so more advanced constructs probably don't need to be explicitly handled.

There is one other minor difficulty worth noting: When DMD wants to inline a function call, and the function's return value is actually used (ex: "auto x = foo();" or "1 + foo()"), the function must get inlined as an expression. Unfortunately, AIUI, a lot of D's statements can't be implemented inside an expression ATM (such as loops), so these would currently prevent such a function call from being inlined.

I don't know how easy or difficult that would be to fix. Conceptually it should be simple: Create an Expression type StatementExp to wrap a Statement as an expression. But other parts of the backend would probably need to know about it, and I'm unfamiliar with the rest of the backend, so have no idea what that would/wouldn't entail.

Not that it can't be done (AFAIK), but since the subject came up I thought I'd give a brief overview of the current DMD inliner, just FWIW.
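
To make the cost rule above concrete, here is a toy model of it. This is not DMD's code: `Expr`, `Lit`, `Bin`, `inlineCost` and `wouldInline` are made-up names, and the model counts only operator nodes so that it agrees with the 1+2*3 example. The real inline.c handles many more node types.

```d
import std.stdio : writeln;

// Toy expression tree; NOT DMD's real AST classes.
abstract class Expr {}

final class Lit : Expr
{
    int value;
    this(int v) { value = v; }
}

final class Bin : Expr
{
    char op;
    Expr lhs, rhs;
    this(char op, Expr lhs, Expr rhs)
    {
        this.op = op;
        this.lhs = lhs;
        this.rhs = rhs;
    }
}

enum costLimit = 250; // threshold quoted in the post

// One unit of cost per operator node (simplifying assumption).
int inlineCost(Expr e)
{
    if (auto b = cast(Bin) e)
        return 1 + inlineCost(b.lhs) + inlineCost(b.rhs);
    return 0; // literals are not counted in this simplified model
}

// The function body is considered inlinable if its cost is under the limit.
bool wouldInline(Expr e) { return inlineCost(e) < costLimit; }

void main()
{
    // 1 + 2 * 3
    Expr e = new Bin('+', new Lit(1), new Bin('*', new Lit(2), new Lit(3)));
    writeln(inlineCost(e));  // 2
    writeln(wouldInline(e)); // true
}
```

Note that a purely size-based rule like this knows nothing about how hot the call site is, which is the complaint raised elsewhere in the thread.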
Mar 14 2014
prev sibling next sibling parent "w0rp" <devw0rp gmail.com> writes:
On Friday, 14 March 2014 at 08:03:04 UTC, John Colvin wrote:
 As much as I like the idea:

 Something always tells me this is the compiler's job... What 
 clever reasoning are you applying that the compiler's inliner 
 can't? It seems like a different situation to say SIMD code, 
 where correctly structuring loops can require a lot of 
 gymnastics that the compiler can't or won't (floating point 
 conformance) do. The inlining decision seems easily automatable 
 in comparison.

 I understand that unoptimised builds for debugging are a 
 problem, but a sensible compiler lets you hand pick your 
 optimisation passes.

 In short: why are compilers not good enough at this that the 
 programmer needs to be involved?

I think it's possible for a programmer to make a better decision about what to do than a compiler. Clearly the compiler isn't smart enough to make the right decisions for Manu now, so I think it would be acceptable to at least insert functionality to give him that control now until the compiler can. There is the question of whether or not it's possible for a compiler to make the right decisions in the right places, but I'm not experienced enough to address that.
Mar 14 2014
prev sibling next sibling parent "duh" <nothx yahoo.com> writes:
 Something always tells me this is the compiler's job... What 
 clever reasoning are you applying that the compiler's inliner 
 can't? It seems like a different situation to say SIMD code, 
 where correctly structuring loops can require a lot of 
 gymnastics that the compiler can't or won't (floating point 
 conformance) do. The inlining decision seems easily automatable 
 in comparison.

 I understand that unoptimised builds for debugging are a 
 problem, but a sensible compiler lets you hand pick your 
 optimisation passes.

 In short: why are compilers not good enough at this that the 
 programmer needs to be involved?

No compiler gets this right 100% of the time, so if it is the compiler's job, they are failing. Most C++ compilers will sometimes require use of forceinline with SSE intrinsics.

Unless it has PGO support, the compiler has no idea about the runtime usage of that code. It wouldn't know which code the program spends 90% of its time in, so it just applies general heuristics when deciding to inline.

What I'd like is the ability to set an inline level per function. Something like 0 being always inline, and 10 being never inline. Unless specified otherwise, the default would be 5.

So if you want forceinline behavior:

inline(0) vec3 dot(vec3 a, vec3 b); //always inlined
inline(10) vec3 cross(vec3 a, vec3 b); //never inlined

And override it at callsite:

inline(10) auto v = dot(a,b);
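
For comparison, D compilers released after this thread accept `pragma(inline, true)` and `pragma(inline, false)` on a function, which roughly covers the two ends of the scale proposed above. It is a per-function setting, so it does not provide the call-site override. A small sketch, where `Vec3`, `dot` and `cross` are illustrative names only:

```d
import std.stdio : writeln;

struct Vec3 { float x, y, z; }

// Ask the compiler to always inline calls to this function.
pragma(inline, true)
float dot(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Ask the compiler never to inline calls to this function.
pragma(inline, false)
Vec3 cross(Vec3 a, Vec3 b)
{
    return Vec3(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x);
}

void main()
{
    auto a = Vec3(1, 2, 3);
    auto b = Vec3(4, 5, 6);
    writeln(dot(a, b)); // 32
    writeln(cross(Vec3(1, 0, 0), Vec3(0, 1, 0)));
}
```

There is no intermediate level between the two, so the 0 to 10 scale has no direct equivalent here.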
Mar 14 2014
prev sibling next sibling parent "Ethan" <gooberman gmail.com> writes:
On Friday, 14 March 2014 at 08:03:04 UTC, John Colvin wrote:
 Something always tells me this is the compiler's job

If all methods are virtual by default, how can the compiler inline the code? Properties are a great example where I'd want to both final and inline them in quite a few cases. In those cases, the existence of inline would negate the need for final entirely because being a virtual method would never come into the equation.

This would also apply to UFCS functions, which I use to wrap D types such as strings into C++ interface vtables without making the programmer jump through a bunch of hoops.

Inline in Microsoft's compiler is always considered a strong hint. There are cases where even __forceinline won't actually inline a function if the compiler decides you're on crack. I assume this would be the case here, and you'd just be helping inform the compiler what you want inlined in case it slips up and gets it wrong.
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 14 March 2014 18:03, John Colvin <john.loughran.colvin gmail.com> wrote:

 As much as I like the idea:

 Something always tells me this is the compiler's job... What clever
 reasoning are you applying that the compiler's inliner can't? It seems like
 a different situation to say SIMD code, where correctly structuring loops
 can require a lot of gymnastics that the compiler can't or won't (floating
 point conformance) do. The inlining decision seems easily automatable in
 comparison.

 I understand that unoptimised builds for debugging are a problem, but a
 sensible compiler lets you hand pick your optimisation passes.

 In short: why are compilers not good enough at this that the programmer
 needs to be involved?

The compiler applies generalised heuristics, which are certainly for the 'common' case, whatever that happens to be.
The compiler simply doesn't know what you're doing, so it's very hard for the compiler to do anything really intelligent.

Inlining heuristics are fickle, and they also don't know what you're actually trying to do.
Is a function 'long'? How long is 'long'? Is the function 'hot'? Do we prefer code size or execution speed? Is the function called only from this location, or is it used in many locations? Etc.
Inlining is one of the most fuzzy pieces of logic in the compiler, and relies on a lot of information that is impossible for the compiler to deduce, so it applies heuristics to try and do a decent job, but it's certainly not perfect.

I argue, nothing so fickle can exist in the language without having a manual override. Especially not in a native language.

In my current case, the functions I need to inline are not exactly trivial. They're really pushing the boundaries of the compiler's inliner heuristics, and then I'm calling a series of such functions that operate on parallel data.
If they don't inline, the performance equals the sum of the functions plus some overhead. If they all inline, the performance is equal to only the longest one, and no overhead (the others fill in pipeline gaps).
Further, some of these functions embed some shared work... if they don't inline, this work is repeated. If they do inline, the redundant repeated work is eliminated.

My experiments with std.algorithm were a failure. I realised quickly that I couldn't rely on the inliner to do a satisfactory job, and the optimiser was unable to do its job properly.
std.algorithm could really benefit from the mixin suggestion since things like predicate functions are always trivial, usually supplied as little lambdas, and inlining isn't reliable. Especially in the debug builds. Something like algorithm loop sugar shouldn't run heaps worse than an explicit loop just because it happens to be implemented by a generic function.
Mar 14 2014
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Friday, 14 March 2014 at 11:04:34 UTC, Manu wrote:
 ...

Thanks for the explanations.

Another use case is to aid propagation of compile-time information for optimisation.
A function might look like a poor candidate for inlining for other reasons, but if there's a statically known (to the caller) integer parameter coming in that will be used to decide a loop length, inlining allows that info to be propagated to the callee. Static loop lengths => well optimised loops, with opportunities for optimal unrolling. Even with quite a large function this can be a good choice to inline.

I don't know how good compilers are at taking this sort of thing into account already.
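
A small sketch of the loop-length case, with made-up names (`sumFirst`, `sumFirstStatic`). In the first function the callee only sees a runtime `n`, so a constant passed by the caller reaches the loop only if the call is inlined (or some other interprocedural optimisation kicks in). The second makes the length a template argument, which is one way to get a static loop length today without depending on the inliner, at the cost of one instantiation per length.

```d
import std.stdio : writeln;

// Loop length is a runtime value as far as the callee is concerned.
// A caller passing a literal only helps if the call gets inlined.
int sumFirst(const(int)[] a, size_t n)
{
    int s = 0;
    foreach (i; 0 .. n)
        s += a[i];
    return s;
}

// Same work, but the length is a compile-time constant inside the
// callee regardless of what the inliner decides.
int sumFirstStatic(size_t n)(const(int)[] a)
{
    int s = 0;
    foreach (i; 0 .. n)
        s += a[i];
    return s;
}

void main()
{
    auto data = [1, 2, 3, 4, 5];
    writeln(sumFirst(data, 4));      // 10
    writeln(sumFirstStatic!4(data)); // 10
}
```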
Mar 14 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
John Colvin:

 Another use case is to aid propagation of compile-time 
 information for optimisation.
 A function might look like a poor candidate for inlining for 
 other reasons, but if there's a statically known (to the 
 caller) integer parameter coming in that will be used to decide 
 a loop length, inlining allows that info to be propagated to 
 the callee. Static loop lengths => well optimised loops, with 
 opportunities for optimal unrolling. Even with quite a large 
 function this can be a good choice to inline.

If the function is private in a module, and it's called only from one point (or otherwise the loop count is the same in different calls), I think this optimization can be performed even if the function is not inlined.

Bye,
bearophile
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 14 March 2014 22:02, John Colvin <john.loughran.colvin gmail.com> wrote:
 Thanks for the explanations.

 Another use case is to aid propagation of compile-time information for
 optimisation.
 A function might look like a poor candidate for inlining for other
 reasons, but if there's a statically known (to the caller) integer
 parameter coming in that will be used to decide a loop length, inlining
 allows that info to be propagated to the callee. Static loop lengths =>
 well optimised loops, with opportunities for optimal unrolling. Even with
 quite a large function this can be a good choice to inline.

Yup, this is a classic example. Extremely relevant.
And it's precisely the sort of thing that an inline heuristic is likely to fail at.

 I don't know how good compilers are at taking this sort of thing into
 account already.

I don't know if they try or not, but I can say from experience that results are generally unreliable.
I would never depend on the inliner to get this right.


On 14 March 2014 22:08, bearophile <bearophileHUGS lycos.com> wrote:
 John Colvin:

  ...

 If the function is private in a module, and it's called only from one point (or otherwise the loop count is the same in different calls), I think this optimization can be performed even if the function is not inlined.

This is probably true, but I would never rely on it.
You have some carefully tuned code that works well, and then one day, some random unrelated thing tweaks a balance, and your previously good code is suddenly slow for unknown reasons.

The point is, there are times when you know your code should be inlined; ie, it's not an 'optimisation', it's an expectation/requirement. A programmer needs to be able to express this.
Mar 14 2014
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2014-03-14 07:21, Manu wrote:
 So, I'm constantly running into issues with not having control over inline.
 I've run into it again doing experiments in preparation for my dconf talk...

 I have identified 2 cases which come up regularly:
   1. A function that should always be inline unconditionally (std.simd
 is effectively blocked on this)
   2. A particular invocation of a function should be inlined for this
 call only

 The first case it just about having control over code gen. Some
 functions should effectively be macros or pseudo-intrinsics (ie,
 intrinsic wrappers in std.simd, beauty wrappers around asm code, etc),
 and I don't ever want to see a symbol appear in the binary.

 My suggestion is introduction of __forceinline or something like it. We
 need this.

Haven't we already agreed a pragma for force inline should be implemented. Or is that something I have dreamed?
 The second case is interesting, and I've found it comes up a few times
 on different occasions.
 In my current instance, I'm trying to build generic framework to perform
 efficient composable data processing, and a basic requirement is that
 the components are inlined, such that the optimiser can interleave the
 work properly.

 Let's imagine I have a template which implements a work loop, which
 wants to call a bunch of work elements it receives by alias. The issue
 is, each of those must be inlined, for this call instance only, and
 there's no way to do this.
 I'm gonna draw the line at stringified code to use with mixin; I hate
 that, and I don't want to encourage use of mixin or stringified code in
 user-facing API's as a matter of practise. Also, some of these work
 elements might be useful functions in their own right, which means they
 can indeed be a function existing somewhere else that shouldn't itself
 be attributed as __forceinline.

 What are the current options to force that some code is inlined?

 My feeling is that an ideal solution would be something like an
 enhancement which would allow the 'mixin' keyword to be used with
 regular function calls.
 What this would do is 'mix in' the function call at this location, ie,
 effectively inline that particular call, and it leverages a keyword and
 concept that we already have. It would obviously produce a compile error
 of the code is not available.

 I quite like this idea, but there is a potential syntactical problem;
 how to assign the return value?

 int func(int y) { return y*y+10; }

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get
 in the way' if the output

I think this is the best syntax of these three alternatives.
 int output = mixin(func(10)); // now i feel paren spammy...

This syntax can't work. It's already interpreted as calling "func" and using the result as a string mixin.
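To illustrate why the parenthesised form is already taken: `mixin(...)` with a string argument is D's existing string mixin, so the argument is evaluated at compile time and the resulting text is compiled as code. A minimal sketch, where `gen` and `twiceTen` are hypothetical names that do not appear in the thread:

```d
import std.conv : to;

// Hypothetical helper: builds D source text at compile time.
string gen(int n)
{
    return n.to!string ~ " * 2";
}

int twiceTen()
{
    // gen(10) is evaluated through CTFE and yields the string "10 * 2",
    // which is then compiled in place as an expression.
    return mixin(gen(10));
}
```

So `mixin(func(10))` already means "call func at compile time and paste the string it returns", which is why it cannot double as an inlining request.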
 mixin(int output = func(10)); // this doesn't feel right...

No.
 My feeling is the first is the best, but I'm not sure about that
 grammatically.

Yeah, I agree.
 The other thing that comes to mind is that it seems like this might make
 a case for AST macros... but I think that's probably overkill for this
 situation, and I'm not confident we're ever gonna attempt to crack that
 nut. I'd like to see something practical and unobjectionable preferably.

AST macros would solve it. It could solve the first use case as well. I would not implement AST macros just to support force inline, but we have many other use cases as well. I would have implemented AST macros a long time ago. Hopefully this would avoid the need to create new language features in some cases.

First use case, just define a macro that returns the AST for the content of the function you would create.

macro func (Ast!(int) a)
{
    return <[ $a * $a; ]>;
}

int output = func(10); // always inlined

Second use case, define a macro, "inline", that takes the function you want to call as a parameter. The macro will basically inline the body.

macro inline (T, U...) (Ast!(T function (U)) func)
{
    // this would probably be more complicated
    return func.body;
}

int output = func(10); // not inlined
int output = inline(func(10)); // always inlined

 This problem is fairly far reaching; phobos receives a lot of lambdas
 these days, which I've found don't reliably inline and interfere with
 the optimisers ability to optimise the code.

I thought since lambdas are passed as template parameters they would always be inlined. -- /Jacob Carlborg
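For reference, "passed as template parameters" means something like the sketch below. The names `workLoop` and `demo` are made up for illustration. The lambda is bound through an alias parameter, so the callee is known statically in each instantiation, but nothing in the language obliges the compiler to inline it.

```d
// Hypothetical work loop: `step` is bound at compile time via an alias parameter.
void workLoop(alias step)(int[] data)
{
    foreach (ref x; data)
        x = step(x);
}

int[] demo()
{
    auto data = [1, 2, 3];
    // Each distinct lambda gives its own instantiation of workLoop, so the
    // call target is fixed; whether it is inlined is still the compiler's choice.
    workLoop!(x => x * x + 10)(data);
    return data;
}
```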
Mar 14 2014
next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2014-03-14 17:57:59 +0000, Jacob Carlborg <doob me.com> said:

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get

I think this is the best syntax of these three alternatives.

Maybe, but what does it do? Should it just inline the call to func? Or should it inline recursively every call inside func? Or maybe something in the middle? -- Michel Fortin michel.fortin michelf.ca http://michelf.ca
Mar 14 2014
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2014-03-14 19:02, Michel Fortin wrote:

 Maybe, but what does it do? Should it just inline the call to func? Or
 should it inline recursively every call inside func? Or maybe something
 in the middle?

I guess Manu needs to answer this one. -- /Jacob Carlborg
Mar 14 2014
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2014-03-19 09:35, Manu wrote:

 I don't already have it, otherwise I'd be making use of it. D has no
 control over the inliner. GDC/LDC offer attributes, but then it's really
 annoying that D has no mechanism to make use of compiler-specific
 attributes in a portable way (ie, attribute aliasing), so I can't make
 use of those without significantly interfering with my code.

Can't you create a tuple with different attributes depending on which compiler is currently compiling? Something like this:

version (LDC)
    alias attributes = TypeTuple!(attribute("forceinline"));
else version (GDC)
    alias attributes = TypeTuple!(attribute("forceinline"));
else version (DigitalMars)
    alias attributes = TypeTuple!();
else
    static assert(false);

@(attributes) void foo () { } // This assumes that "attributes" will be expanded

-- /Jacob Carlborg
Mar 20 2014
prev sibling parent reply "Daniel Murphy" <yebbliesnospam gmail.com> writes:
"Manu" <turkeyman gmail.com> wrote in message 
news:mailman.128.1394856947.23258.digitalmars-d puremagic.com...

 Haven't we already agreed a pragma for force inline should be 
 implemented. Or is
 that something I have dreamed?

It's been discussed. I never agreed to it (I _really_ don't like it), but I'll take it if it's the best I'm gonna get. I don't like stateful attributes like that. I think it's error prone, especially when it's silent. 'private:' for instance will complain if you write a new function in an area influenced by the private state and try and call it from elsewhere; ie, you know you made the mistake. If you write a new function in an area influenced by the forceinline state which wasn't intended to be inlined, you won't know. I think that's dangerous.

Huh? The pragma could easily be restricted to apply to exactly one function declaration, if that's what's desired.
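For what it's worth, later D compilers did gain a `pragma(inline, ...)`. A minimal sketch of the single-declaration form described above, next to the stateful colon form that the objection is about (all function names here are illustrative only):

```d
// Applies to exactly one declaration.
pragma(inline, true)
int square(int x) { return x * x; }

// Not covered by the pragma above; left to the regular heuristics.
int cube(int x) { return x * x * x; }

// Colon form: stateful, affects every declaration that follows in this scope.
pragma(inline, true):
int addOne(int x) { return x + 1; }
int addTwo(int x) { return x + 2; }
```

With the colon form, a function added later below that line silently picks up the pragma, which is the failure mode being argued about.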
Mar 14 2014
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/17/14, 6:26 AM, Manu wrote:
 On 15 March 2014 14:55, Manu <turkeyman gmail.com
 <mailto:turkeyman gmail.com>> wrote:

     On 15 March 2014 14:33, Daniel Murphy <yebbliesnospam gmail.com
     <mailto:yebbliesnospam gmail.com>> wrote:

         "Manu" <turkeyman gmail.com <mailto:turkeyman gmail.com>> wrote
         in message
         news:mailman.128.1394856947.23258.digitalmars-d puremagic.com...

              > Haven't we already agreed a pragma for force inline
              > should be implemented. Or is
              > that something I have dreamed?

             It's been discussed. I never agreed to it (I _really_ don't
             like it), but I'll take it if it's the best
             I'm gonna get.

             I don't like stateful attributes like that. I think it's
             error prone, especially when it's silent.
             'private:' for instance will complain if you write a new
             function in an area influenced by the
             private state and try and call it from elsewhere; ie, you
             know you made the mistake.
             If you write a new function in an area influenced by the
             forceinline state which wasn't intended
             to be inlined, you won't know. I think that's dangerous.


         Huh?  The pragma could easily be restricted to apply to exactly
         one function declaration, if that's what's desired.


     Then why bother with a pragma?
     It's just a special case for the sake of a special case... I don't
     see why resist the language conventions. Where's the precedent for
     that? It just sounds like it's asking to cause edge cases and
     trouble down the line.
     Is it gonna get messy when it involves with templates? What about
     methods, sub-functions?


 *bump*
 I actually care about this a whole lot more than final-by-default right
 now ;)

 I'd like to think there's a possible solution to these problems that
 everyone agrees with.

I'd like to see a solution to inlining along the lines of "pliz pliz inline" (best effort) and "never inline".

Outlining only at a specific call site is seldom needed, and when it is, it's trivially achievable with a noinline function forwarding to the inline function. Inlining only at a specific call site is a tall order and essentially impossible if header generation had been used.

Andrei
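The forwarding trick for outlining one call site can be sketched as follows. This assumes the `pragma(inline, ...)` that later D compilers provide, which did not exist when this was written; `work` and `workOutlined` are hypothetical names.

```d
// Requested to be inlined wherever it is called.
pragma(inline, true)
int work(int y) { return y * y + 10; }

// Never inlined: calling this wrapper keeps one particular call site out of
// line, while work itself can still be inlined into the wrapper's body.
pragma(inline, false)
int workOutlined(int y) { return work(y); }
```

A caller that wants the out-of-line version at one spot calls `workOutlined(10)` there and `work(10)` everywhere else.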
Mar 17 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/17/14, 9:10 AM, Manu wrote:
 On 18 March 2014 01:36, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org <mailto:SeeWebsiteForEmail erdani.org>>
 wrote:


     I'd like to see a solution to inlining along the lines of "pliz pliz
     inline" (best effort) and "never inline".

     Outlining only at a specific call site is seldom needed and when it
     is it's trivially achievable with a noinline function forwarding to
     the inline function. Inlining only at a specific call site is a tall
     order and essentially impossible if header generation had been used.


 I don't follow, how does that work?
 It's the key innovation here. Since D doesn't have macros, I think it's
 something that really needs to be supported nicely.
 Obviously it's impossible if source is unavailable. It should give the
 same complaints that CTFE gives when source is unavailable.

The notion that a compiler can ask for any function to be inlined without the compiler having been "warned" in the function declaration makes me uncomfortable about feasibility. However, upon further thinking the same happens with CTFE. Andrei
Mar 17 2014
prev sibling next sibling parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 3/13/14, 11:21 PM, Manu wrote:
 My feeling is that an ideal solution would be something like an
 enhancement which would allow the 'mixin' keyword to be used with
 regular function calls.
 What this would do is 'mix in' the function call at this location, ie,
 effectively inline that particular call, and it leverages a keyword and
 concept that we already have. It would obviously produce a compile error
 of the code is not available.

 I quite like this idea, but there is a potential syntactical problem;
 how to assign the return value?

 int func(int y) { return y*y+10; }

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get
 in the way' if the output
 int output = mixin(func(10)); // now i feel paren spammy...
 mixin(int output = func(10)); // this doesn't feel right...

 My feeling is the first is the best, but I'm not sure about that
 grammatically.

Is there already some trait for getting the string value of a function including its code? If so then a mixin plus a small helper function might do the job. If not then is such a trait feasible?
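A rough sketch of what "a mixin plus a small helper" could look like if the source text were available. No such trait is assumed here; the body is kept in a string by hand, and `funcBody` and `caller` are hypothetical names. This is the stringified-code approach the original post wants to avoid, shown only to make the idea concrete.

```d
// The body is kept as source text by hand; no trait that recovers it is assumed.
enum funcBody = "y * y + 10";

// Ordinary callable version built from the same text.
int func(int y) { return mixin(funcBody); }

int caller()
{
    int y = 10;
    // The expression is pasted here, so no call to func is emitted at this site.
    int output = mixin(funcBody);
    return output;
}
```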
Mar 14 2014
parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 14.03.2014 19:09, schrieb David Gileadi:
 On 3/13/14, 11:21 PM, Manu wrote:
 My feeling is that an ideal solution would be something like an
 enhancement which would allow the 'mixin' keyword to be used with
 regular function calls.
 What this would do is 'mix in' the function call at this location, ie,
 effectively inline that particular call, and it leverages a keyword and
 concept that we already have. It would obviously produce a compile error
 of the code is not available.

 I quite like this idea, but there is a potential syntactical problem;
 how to assign the return value?

 int func(int y) { return y*y+10; }

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get
 in the way' if the output
 int output = mixin(func(10)); // now i feel paren spammy...
 mixin(int output = func(10)); // this doesn't feel right...

 My feeling is the first is the best, but I'm not sure about that
 grammatically.

Is there already some trait for getting the string value of a function including its code? If so then a mixin plus a small helper function might do the job. If not then is such a trait feasible?

Might be problematic with modules delivered only in .di + binary form. -- Paulo
Mar 14 2014
parent David Gileadi <gileadis NSPMgmail.com> writes:
On 3/14/14, 1:42 PM, Paulo Pinto wrote:
 Am 14.03.2014 19:09, schrieb David Gileadi:
 On 3/13/14, 11:21 PM, Manu wrote:
 My feeling is that an ideal solution would be something like an
 enhancement which would allow the 'mixin' keyword to be used with
 regular function calls.
 What this would do is 'mix in' the function call at this location, ie,
 effectively inline that particular call, and it leverages a keyword and
 concept that we already have. It would obviously produce a compile error
 of the code is not available.

 I quite like this idea, but there is a potential syntactical problem;
 how to assign the return value?

 int func(int y) { return y*y+10; }

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get
 in the way' if the output
 int output = mixin(func(10)); // now i feel paren spammy...
 mixin(int output = func(10)); // this doesn't feel right...

 My feeling is the first is the best, but I'm not sure about that
 grammatically.

Is there already some trait for getting the string value of a function including its code? If so then a mixin plus a small helper function might do the job. If not then is such a trait feasible?

Might be problematic with modules delivered only in .di + binary form. -- Paulo

Quite, but as Manu says about his proposed solution,
 It would obviously produce a compile error
 of (sic) the code is not available.

This would need to behave similarly.
Mar 14 2014
prev sibling next sibling parent "Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:
On Friday, 14 March 2014 at 22:12:38 UTC, Nick Sabalausky wrote:
 On 3/14/2014 8:37 AM, Manu wrote:
 On 14 March 2014 22:02, John Colvin 
 <john.loughran.colvin gmail.com> wrote:
 I don't know how good compilers are at taking this sort of 
 thing into
 account already.

I don't know if they try or not, but I can say from experience that results are generally unreliable. I would never depend on the inliner to get this right.

I don't know how this compares to other inliners, but FWIW, DMD's inliner is pretty simple (By coincidence, I was just digging into it the other day):

Every expression node (ie non-statement, non-declaration) in the function's AST adds 1 to the cost of inlining (so ex: 1+2*3 would have a cost of 2 - one mult, plus one addition). If the total cost is under 250, the function is inlined.

Also, any type of AST node that isn't explicitly handled in inline.c will prevent a function from ever being inlined (since the inliner doesn't know how to inline it). I assume this is probably after lowerings are done, though, so more advanced constructs probably don't need to be explicitly handled.

There is one other minor difficulty worth noting: When DMD wants to inline a function call, and the function's return value is actually used (ex: "auto x = foo();" or "1 + foo()"), the function must get inlined as an expression. Unfortunately, AIUI, a lot of D's statements can't be implemented inside an expression ATM (such as loops), so these would currently prevent such a function call from being inlined. I don't know how easy or difficult that would be to fix. Conceptually it should be simple: Create an Expression type StatementExp to wrap a Statement as an expression. But other parts of the backend would probably need to know about it, and I'm unfamiliar with the rest of the backend, so have no idea what that would/wouldn't entail.

Not that it can't be done (AFAIK), but since the subject came up I thought I'd give a brief overview of the current DMD inliner, just FWIW.

Probably one easy adjustment that would result in a lot of gain in optimization would be to bump the lower bound of 250 if the function is an operator overload.
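The cost heuristic described in the quoted overview can be modelled with a toy like the one below. This is not DMD's code: `Expr`, `inlineCost`, `shouldInline` and `sample` are invented for illustration, and only the "one unit per operator node, inline if under 250" rule is taken from the post. The `limit` parameter is where a per-function adjustment such as the operator-overload bump would plug in.

```d
// Toy expression tree; this is not DMD's real AST.
class Expr
{
    Expr left, right; // both null for a literal leaf

    this() {}
    this(Expr left, Expr right)
    {
        this.left = left;
        this.right = right;
    }
}

enum defaultCostLimit = 250; // threshold quoted in the post

// One unit of cost per operator node; leaves are free, matching the
// "1+2*3 costs 2" example.
int inlineCost(const Expr e)
{
    if (e is null || e.left is null)
        return 0;
    return 1 + inlineCost(e.left) + inlineCost(e.right);
}

// A function is a candidate only if its total cost is under the limit.
bool shouldInline(const Expr fnBody, int limit = defaultCostLimit)
{
    return inlineCost(fnBody) < limit;
}

// Builds the tree for 1 + 2 * 3.
Expr sample()
{
    return new Expr(new Expr(), new Expr(new Expr(), new Expr()));
}
```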
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 15 March 2014 03:57, Jacob Carlborg <doob me.com> wrote:

 On 2014-03-14 07:21, Manu wrote:

 So, I'm constantly running into issues with not having control over
 inline.
 I've run into it again doing experiments in preparation for my dconf
 talk...

 I have identified 2 cases which come up regularly:
   1. A function that should always be inline unconditionally (std.simd
 is effectively blocked on this)
   2. A particular invocation of a function should be inlined for this
 call only

 The first case it just about having control over code gen. Some
 functions should effectively be macros or pseudo-intrinsics (ie,
 intrinsic wrappers in std.simd, beauty wrappers around asm code, etc),
 and I don't ever want to see a symbol appear in the binary.

 My suggestion is introduction of __forceinline or something like it. We
 need this.

Haven't we already agreed a pragma for force inline should be implemented. Or is that something I have dreamed?

It's been discussed. I never agreed to it (I _really_ don't like it), but I'll take it if it's the best I'm gonna get.

I don't like stateful attributes like that. I think it's error prone, especially when it's silent. 'private:' for instance will complain if you write a new function in an area influenced by the private state and try and call it from elsewhere; ie, you know you made the mistake. If you write a new function in an area influenced by the forceinline state which wasn't intended to be inlined, you won't know. I think that's dangerous.

 The second case is interesting, and I've found it comes up a few times
 on different occasions.
 In my current instance, I'm trying to build generic framework to perform
 efficient composable data processing, and a basic requirement is that
 the components are inlined, such that the optimiser can interleave the
 work properly.

 Let's imagine I have a template which implements a work loop, which
 wants to call a bunch of work elements it receives by alias. The issue
 is, each of those must be inlined, for this call instance only, and
 there's no way to do this.
 I'm gonna draw the line at stringified code to use with mixin; I hate
 that, and I don't want to encourage use of mixin or stringified code in
 user-facing API's as a matter of practise. Also, some of these work
 elements might be useful functions in their own right, which means they
 can indeed be a function existing somewhere else that shouldn't itself
 be attributed as __forceinline.

 What are the current options to force that some code is inlined?

 My feeling is that an ideal solution would be something like an
 enhancement which would allow the 'mixin' keyword to be used with
 regular function calls.
 What this would do is 'mix in' the function call at this location, ie,
 effectively inline that particular call, and it leverages a keyword and
 concept that we already have. It would obviously produce a compile error
 of the code is not available.

 I quite like this idea, but there is a potential syntactical problem;
 how to assign the return value?

 int func(int y) { return y*y+10; }

 int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get
 in the way' if the output

I think this is the best syntax of these three alternatives.

 int output = mixin(func(10)); // now i feel paren spammy...

This syntax can't work. It's already interpreted as calling "func" and using the result as a string mixin.

 mixin(int output = func(10)); // this doesn't feel right...

No.

 My feeling is the first is the best, but I'm not sure about that
 grammatically.

Yeah, I agree.

So you think it's grammatically okay?

 The other thing that comes to mind is that it seems like this might make
 a case for AST macros... but I think that's probably overkill for this
 situation, and I'm not confident we're ever gonna attempt to crack that
 nut. I'd like to see something practical and unobjectionable preferably.

AST macros would solve it. It could solve the first use case as well. I would not implement AST macros just to support force inline, but we have many other use cases as well. I would have implemented AST macros a long time ago. Hopefully this would avoid the need to create new language features in some cases.

First use case, just define a macro that returns the AST for the content of the function you would create.

macro func (Ast!(int) a)
{
    return <[ $a * $a; ]>;
}

int output = func(10); // always inlined

Second use case, define a macro, "inline", that takes the function you want to call as a parameter. The macro will basically inline the body.

macro inline (T, U...) (Ast!(T function (U)) func)
{
    // this would probably be more complicated
    return func.body;
}

int output = func(10); // not inlined
int output = inline(func(10)); // always inlined

 This problem is fairly far reaching; phobos receives a lot of lambdas
 these days, which I've found don't reliably inline and interfere with
 the optimisers ability to optimise the code.

I thought since lambdas are passed as template parameters they would always be inlined.

Maybe... (and not in debug builds). Without explicit control of the inliner, you just never know.
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 15 March 2014 04:02, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2014-03-14 17:57:59 +0000, Jacob Carlborg <doob me.com> said:

  int output = mixin func(10); // the 'mixin' keyword seems to kinda 'get

I think this is the best syntax of these three alternatives.

Maybe, but what does it do? Should it just inline the call to func? Or should it inline recursively every call inside func? Or maybe something in the middle?

I'd say it should inline only func. Any sub-calls are subject to the regular inline heuristics.
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 15 March 2014 14:33, Daniel Murphy <yebbliesnospam gmail.com> wrote:

 "Manu" <turkeyman gmail.com> wrote in message news:mailman.128.1394856947.
 23258.digitalmars-d puremagic.com...

  > Haven't we already agreed a pragma for force inline should be implemented. Or is that something I have dreamed?

It's been discussed. I never agreed to it (I _really_ don't like it), but I'll take it if it's the best I'm gonna get. I don't like stateful attributes like that. I think it's error prone, especially when it's silent. 'private:' for instance will complain if you write a new function in an area influenced by the private state and try and call it from elsewhere; ie, you know you made the mistake. If you write a new function in an area influenced by the forceinline state which wasn't intended to be inlined, you won't know. I think that's dangerous.

Huh? The pragma could easily be restricted to apply to exactly one function declaration, if that's what's desired.
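To make the two scoping forms in this exchange concrete, here is a minimal D sketch. It assumes the `pragma(inline, true)` that D compilers gained after this thread; nothing like it existed at the time of writing, and the function and struct names are made up for illustration.

```d
module inline_scope_sketch;

// Form 1: the pragma is attached to exactly one declaration.
pragma(inline, true)
int twice(int x) { return x + x; }

// Form 2: the "stateful" colon form covers every declaration that
// follows it in the same scope.
struct Wrappers
{
    pragma(inline, true):
    static int addOne(int x) { return x + 1; }

    // A function added here later silently inherits the pragma,
    // which is the concern raised about stateful attributes.
    static int addTwo(int x) { return x + 2; }
}

void main()
{
    assert(twice(21) == 42);
    assert(Wrappers.addOne(1) == 2);
    assert(Wrappers.addTwo(1) == 3);
}
```

Both forms produce the same results; the difference is only in how far the request reaches and whether a misplaced function would be noticed.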

Then why bother with a pragma?
It's just a special case for the sake of a special case... I don't see why we'd resist the language conventions. Where's the precedent for that? It just sounds like it's asking to cause edge cases and trouble down the line.
Is it gonna get messy when it gets involved with templates? What about methods, sub-functions?
Mar 14 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 15 March 2014 14:55, Manu <turkeyman gmail.com> wrote:

 On 15 March 2014 14:33, Daniel Murphy <yebbliesnospam gmail.com> wrote:

 "Manu" <turkeyman gmail.com> wrote in message
 news:mailman.128.1394856947.23258.digitalmars-d puremagic.com...

  > Haven't we already agreed a pragma for force inline should be implemented. Or is that something I have dreamed?

It's been discussed. I never agreed to it (I _really_ don't like it), but I'll take it if it's the best I'm gonna get. I don't like stateful attributes like that. I think it's error prone, especially when it's silent. 'private:' for instance will complain if you write a new function in an area influenced by the private state and try and call it from elsewhere; ie, you know you made the mistake. If you write a new function in an area influenced by the forceinline state which wasn't intended to be inlined, you won't know. I think that's dangerous.

Huh? The pragma could easily be restricted to apply to exactly one function declaration, if that's what's desired.

Then why bother with a pragma? It's just a special case for the sake of a special case... I don't see why resist the language conventions. Where's the precedent for that? It just sounds like it's asking to cause edge cases and trouble down the line. Is it gonna get messy when it involves with templates? What about methods, sub-functions?

*bump*
I actually care about this a whole lot more than final-by-default right now ;)

I'd like to think there's a possible solution to these problems that everyone agrees with.
Mar 17 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 18 March 2014 01:36, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 I'd like to see a solution to inlining along the lines of "pliz pliz
 inline" (best effort) and "never inline".

 Outlining only at a specific call site is seldom needed and when it is
 it's trivially achievable with a noinline function forwarding to the inline
 function. Inlining only at a specific call site is a tall order and
 essentially impossible if header generation had been used.

I don't follow, how does that work?
It's the key innovation here. Since D doesn't have macros, I think it's something that really needs to be supported nicely.
Obviously it's impossible if source is unavailable. It should give the same complaints that CTFE gives when source is unavailable.
Mar 17 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 18 March 2014 06:37, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 3/17/14, 9:10 AM, Manu wrote:

 On 18 March 2014 01:36, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:


     I'd like to see a solution to inlining along the lines of "pliz pliz
     inline" (best effort) and "never inline".

     Outlining only at a specific call site is seldom needed and when it
     is it's trivially achievable with a noinline function forwarding to
     the inline function. Inlining only at a specific call site is a tall
     order and essentially impossible if header generation had been used.


 I don't follow, how does that work?
 It's the key innovation here. Since D doesn't have macros, I think it's
 something that really needs to be supported nicely.
 Obviously it's impossible if source is unavailable. It should give the
 same complaints that CTFE gives when source is unavailable.

The notion that a compiler can ask for any function to be inlined without the compiler having been "warned" in the function declaration makes me uncomfortable about feasibility. However, upon further thinking the same happens with CTFE.

Exactly, we already have it in CTFE. It doesn't really add any new concept that D isn't already comfortable with.

It can kinda be seen as sort of a type safe macro, which is a tool that D is lacking compared to C. I think the mixin keyword and concept makes perfect sense in this context. It feels quite intuitive to me.
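For readers less familiar with the CTFE comparison, a small D sketch of it follows. The function name is hypothetical; the point is only that the callee carries no annotation and the caller's context decides that compile-time evaluation happens, which requires the callee's source to be visible.

```d
// Hypothetical helper; note that it has no special attribute.
int square(int x) { return x * x; }

// An enum initialiser must be a compile-time constant, so this call is
// evaluated by CTFE purely because of how the caller uses the result.
enum atCompileTime = square(10);
static assert(atCompileTime == 100);

void main()
{
    // The same function called normally at run time.
    int n = 10;
    assert(square(n) == 100);
}
```

A call-site inline request would be analogous in that respect: the decision sits with the caller, and it can only work when the body is available to the compiler.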
Mar 18 2014
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Saturday, 15 March 2014 at 04:17:06 UTC, Manu wrote:
 I'd say it should inline only func. Any sub-calls are subject to the
 regular inline heuristics.

I agree with you that explicit inlining is absolutely necessary and that call site inlining is highly desirable. However, I think that the call-site inlining should inline as much as possible. Basically this is something you will try when the code is too slow to meet real time deadlines and you hope to avoid going for a hand optimized solution in order to cut down on dev time. That suggests aggressive inlining to me.

If the inlining only goes one level then I don't think this will be used frequently enough to be useful, e.g. you can just create one inline version and then a non-inline version that calls the inline version. E.g.:

noninline_func(){ inline_func();}

Ola.
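Spelled out, the wrapper pattern in this post could look like the D sketch below. It is only an illustration under assumptions: it uses `pragma(inline, true)` and `pragma(inline, false)`, which were added to D after this thread, and the function bodies are invented.

```d
// Requested to always be expanded into its callers.
pragma(inline, true)
int inlineFunc(int x) { return x * 2 + 1; }

// An out-of-line entry point that simply forwards to the inline version.
pragma(inline, false)
int nonInlineFunc(int x) { return inlineFunc(x); }

void main()
{
    // Callers that want the body expanded in place use inlineFunc.
    assert(inlineFunc(3) == 7);

    // Callers that want an ordinary call use the wrapper.
    assert(nonInlineFunc(3) == 7);
}
```

With this arrangement the choice between an inlined body and a real call is made by picking which of the two names to call.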
Mar 18 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 18 March 2014 23:11, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Saturday, 15 March 2014 at 04:17:06 UTC, Manu wrote:

 I'd say it should inline only func. Any sub-calls are subject to the
 regular inline heuristics.

I agree with you that explicit inlining is absolutely necessary and that call site inlining is highly desirable. However, I think that the call-site inlining should inline as much as possible. Basically this is something you will try when the code is too slow to meet real time deadlines and you hope to avoid going for a hand optimized solution in order to cut down on dev time. That suggests aggressive inlining to me.

Inlining is a basic codegen tool, and it's important that low-level programmers have tight control over this aspect of the compiler's codegen. I think it's a mistake to consider it an optimisation, unless you know precisely what you're doing.
I wouldn't want to see it try and forcibly inline the whole tree; there's no reason to believe that the whole tree should be inlined 100% of the time, rather, it's almost certainly not the case.
In the case you do want to inline the whole tree, you can just cascade the mixin through the stack. In the case you suggest which flattens the tree by default, we've lost control; how to tell it only to do it for one level without hacks? And I believe this is the common case.

For example, it's very likely that you might require a function to inline that is relatively trivial in its own right; a wrapper or a macro effectively, but which conditionally calls an expensive function, or perhaps calls a function that you don't have source for (it would break at that point if it tried to inline the tree).

 If the inlining only goes one level then I don't think this will be used
 frequently enough to be useful, e.g. you can just create one inline version
 and then a non-inline version that calls the inline version.

As the one that requested it, I have numerous uses for it to mixin just the one level. I can't imagine any uses where I would ever want to explicitly inline the whole tree, and not be happy to cascade it manually.

 E.g.:
 noninline_func(){ inline_func();}

Why? This is really overcomplicating a simple thing. And I'm not quite sure what you're suggesting this should do either. Are you saying the call tree is flattened behind this proxy non-inline function? I don't think that's useful.
I don't think anything would/should be marked __alwaysinline unless you REALLY mean that it has literally no business being called. Ie, marking something __alwaysinline just for the sake of wrapping it with a non-inline is the wrong thing to do.

Just to reiterate, inline is a tool, not an 'optimisation'. It doesn't necessarily yield faster code, in many situations it is slower, and best left to the compiler to decide. But it's an important tool for any low-level programmer to have. D must provide a sufficient suite of low-level tools that allow proper control over the code generation. I think as a tool, it should be deliberate and conservative in approach, ie, just one level, and let the programmer cascade it if that's what they mean to do. There should be no surprises with something like this, and if it's inlining a whole call tree, you often don't know what happens further down the tree, and it's more likely to change on you unexpectedly.
Mar 18 2014
prev sibling next sibling parent "Ola Fosheim Grøstad" writes:
On Wednesday, 19 March 2014 at 01:28:48 UTC, Manu wrote:
 In the case you do want to inline the whole tree, you can just cascade the
 mixin through the stack. In the case you suggest which flattens the tree by
 default, we've lost control; how to tell it only to do it for one level
 without hacks? And I believe this is the common case.

You could provide it with a recursion level parameter or parameters for cost level heuristics. It could also be used to flatten tail-call recursion.
 As the one that requested it, I have numerous uses for it to mixin just the
 one level. I can't imagine any uses where I would ever want to explicitly
 inline the whole tree, and not be happy to cascade it manually.

In inner loops, to factor out common subexpressions that are otherwise recomputed over and over and over. When the function is generated code (not hand written).
 noninline_func(){ inline_func();}

Why? This is really overcomplicating a simple thing. And I'm not quite sure what you're suggesting this should do either. Are you saying the call tree is flattened behind this proxy non-inline function?

No, I am saying that the one level mixin doesn't provide you with anything new. You already have that. It is sugar.
Mar 18 2014
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:

On 19 March 2014 16:18, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Wednesday, 19 March 2014 at 01:28:48 UTC, Manu wrote:

 In the case you do want to inline the whole tree, you can just cascade the
 mixin through the stack. In the case you suggest which flattens the tree by
 default, we've lost control; how to tell it only to do it for one level
 without hacks? And I believe this is the common case.

You could provide it with a recursion level parameter or parameters for cost level heuristics.

Again, I think this is significantly overcomplicating something which I see as being extremely simple.

 It could also be used to flatten tail-call recursion.

I don't think it's valid to inline a tail call recursion, because the inlined call also wants to inline another call to itself...
You can't know how far it should go, so it needs to be transformed into a loop, and now we're talking about something completely different than inlining.

 As the one that requested it, I have numerous uses for it to mixin just the
 one level. I can't imagine any uses where I would ever want to explicitly
 inline the whole tree, and not be happy to cascade it manually.

In innerloops to factor out common subexpressions that are otherwise recomputed over and over and over.

This is highly context sensitive. I would trust the compiler heuristics to make the right decision here.
The idea of eliminating common sub-expressions suggests that there _are_ common sub-expressions, which aren't affected by the function arguments.
This case is highly unusual in my experience. And I personally wouldn't depend on a feature such as this to address that sort of a problem in my codegen. I would just refactor the function a little bit to call the common sub-expression ahead of time.

 When the function is generated code (not hand written).

I'm not sure what you mean here?

 noninline_func(){ inline_func();}

Why? This is really overcomplicating a simple thing. And I'm not quite sure what you're suggesting this should do either. Are you saying the call tree is flattened behind this proxy non-inline function?

No, I am saying that the one level mixin doesn't provide you with anything new.

It really does provide something new. It provides effectively, a type-safe implementation of something that may be used in place of C/C++ macros. I think that would be extremely useful in a variety of applications.

 You already have that. It is sugar.

I don't already have it, otherwise I'd be making use of it. D has no control over the inliner. GDC/LDC offer attributes, but then it's really annoying that D has no mechanism to make use of compiler-specific attributes in a portable way (ie, attribute aliasing), so I can't make use of those without significantly interfering with my code.

I also don't think that suggestion of yours works. I suspect the compiler will see the outer function as a trivial wrapper which will fall within the compiler's normal inline heuristics, and it will all inline anyway.
Mar 19 2014
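The point made above about tail-call recursion (it has to be turned into a loop, which is a different transformation from inlining) can be sketched as follows. This is only an illustration: `gcdRecursive` and `gcdLoop` are hypothetical names that do not appear in the thread.

```d
// Hypothetical example functions; neither appears in the thread.

// Tail-recursive form: the recursive call is the last thing the function does.
// Inlining that call would only expose another call to gcdRecursive.
ulong gcdRecursive(ulong a, ulong b)
{
    return b == 0 ? a : gcdRecursive(b, a % b);
}

// Loop form: the tail call becomes a jump back to the top with updated arguments.
ulong gcdLoop(ulong a, ulong b)
{
    while (b != 0)
    {
        immutable rest = a % b;
        a = b;
        b = rest;
    }
    return a;
}
```

Both functions compute the same result; only the second has a body that contains no call at all.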
"Ola Fosheim Grøstad" writes:
On Wednesday, 19 March 2014 at 08:35:53 UTC, Manu wrote:
 The idea of eliminating common sub-expressions suggests that 
 there _are_
 common sub-expressions, which aren't affected by the function 
 arguments.
 This case is highly unusual in my experience.

Not if you delay optimization until profiling and focus on higher level structures during initial implementation. Or use composing (like generic programming). If you hand optimize right from the start then you might be right, but if you never call a function with the same parameters then you are doing premature optimization IMHO.
 When the function is generated code (not hand written).

 I'm not sure what you mean here?

Code that is generated by a tool (or composable templates or whatever) tends to be repetitive and suboptimal. I.e. boilerplate code that looks like it was written by a monkey…
 You already have that. It is sugar.

 I don't already have it, otherwise I'd be making use of it. D has no control over the inliner.

I meant that if you have explicit inline hints like C++ then you also have call-site inlining if you want to.
 I also don't think that suggestion of yours works. I suspect 
 the compiler
 will see the outer function as a trivial wrapper which will 
 fall within the
 compiler's normal inline heuristics, and it will all inline
 anyway.

That should be considered a bug if it is called from more than one location.
Mar 19 2014
Manu <turkeyman gmail.com> writes:

On 19 March 2014 19:16, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Wednesday, 19 March 2014 at 08:35:53 UTC, Manu wrote:

 The idea of eliminating common sub-expressions suggests that there _are_
 common sub-expressions, which aren't affected by the function arguments.
 This case is highly unusual in my experience.

 Not if you delay optimization until profiling and focus on higher level structures during initial implementation. Or use composing (like generic programming).

 If you hand optimize right from the start then you might be right, but if you never call a function with the same parameters then you are doing premature optimization IMHO.

Okay, do you have use cases for any of this stuff? Are you just making it up, or do you have significant experience to say this is what you need?
I can say for a fact, that recursive inline would make almost everything I want it for much more annoying. I would find myself doing stupid stuff to fight the recursive inliner in every instance.

 When the function is generated code (not hand written).

 I'm not sure what you mean here?

 Code that is generated by a tool (or composable templates or whatever) tends to be repetitive and suboptimal. I.e. boilerplate code that looks like it was written by a monkey…

I'm not sure where the value is... why would you want to inline this?

 You already have that. It is sugar.

 I don't already have it, otherwise I'd be making use of it. D has no control over the inliner.

 I meant that if you have explicit inline hints like C++ then you also have call-site inlining if you want to.

I still don't follow. C++ doesn't have call-site inlining. C/C++ has macros, and there is no way to achieve the same functionality in D right now, that's a key motivation for the proposal.

 I also don't think that suggestion of yours works. I suspect the compiler will see the outer function as a trivial wrapper which will fall within the compiler's normal inline heuristics, and it will all inline anyway.

 That should be considered a bug if it is called from more than one location.

Seriously, you're making 'inline' about 10 times more complicated than it should ever be.
If you ask me, I see no value in recursive inlining; in fact, that would anger me considerably.

By making this hard, you're also making it equally unlikely. Let inline exist first, then if/when it doesn't suit your use cases, argue for the details.
Mar 19 2014
"Ola Fosheim Grøstad" writes:
On Wednesday, 19 March 2014 at 12:35:30 UTC, Manu wrote:
 Okay, do you have use cases for any of this stuff? Are you just 
 making it
 up, or do you have significant experience to say this is what 
 you need?

I don't need anything, I hand optimize prematurely. And I don't want to do that.

But yes, inner loops benefit from exhaustive inlining because you get to move common expressions out of the loop or change them to delta increments. It is only when you trash the caches that inlining does not pay off.

I do it by hand. I don't want to do it by hand.
 If you ask me, I have no value in recursive inlining, in fact,
 that would
 anger me considerably.

Why? You could always set the depth to 1, or make 1 the default. And it isn't difficult to implement.
Mar 19 2014
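The benefit claimed above for inner loops, moving an expression that does not depend on the loop variable out of the loop, can be sketched by hand as follows. This is a minimal illustration under assumptions: the names `scale`, `sumNaive` and `sumHoisted` are hypothetical, and whether a given compiler performs the same hoisting automatically after inlining `scale` depends on its heuristics.

```d
import std.math : sqrt;

// Hypothetical helper: the sqrt term depends only on a and b, not on x.
double scale(double x, double a, double b)
{
    return x * sqrt(a * a + b * b);
}

// Calls the helper per element. Unless the call is inlined and the optimiser
// hoists the invariant part, sqrt is recomputed on every iteration.
double sumNaive(const double[] xs, double a, double b)
{
    double total = 0;
    foreach (x; xs)
        total += scale(x, a, b);
    return total;
}

// Hand-refactored version: the common sub-expression is computed once,
// ahead of the loop.
double sumHoisted(const double[] xs, double a, double b)
{
    immutable k = sqrt(a * a + b * b);
    double total = 0;
    foreach (x; xs)
        total += x * k;
    return total;
}
```

The second function is what doing it by hand looks like; the disagreement in the thread is over whether a language feature should be needed to get the first one compiled to the same thing.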
Manu <turkeyman gmail.com> writes:

On 20 March 2014 06:23, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Wednesday, 19 March 2014 at 12:35:30 UTC, Manu wrote:

 Okay, do you have use cases for any of this stuff? Are you just making it
 up, or do you have significant experience to say this is what you need?

 I don't need anything, I hand optimize prematurely. And I don't want to do that.

 But yes, inner loops benefit from exhaustive inlining because you get to move common expressions out of the loop or change them to delta increments. It is only when you trash the caches that inlining does not pay off.

 I do it by hand. I don't want to do it by hand.

 If you ask me, I have no value in recursive inlining, in fact, that would anger me considerably.

 Why? You could always set the depth to 1, or make 1 the default.

 And it isn't difficult to implement.

The problem is upside down. If you want to inline multiple levels, you start from the leaves and move downwards, not from the root moving upwards (leaving a bunch of leaves perhaps not inlined), which is what you're really suggesting.
Inlining should be strictly deliberate, there's nothing to say that every function called in a tree should be inlined. There's a high probability there's one/some that shouldn't be among a few that should.

Remember too, that call-site inlining isn't the only method, there would also be always-inline. I think always-inline is what you want for some decidedly trivial functions (although these will probably be heuristically inlined anyway), not call-site inlining. I just don't see how recursive call-site inlining is appropriate, considering that call trees are often complex, subject to change, and may even call functions that you don't have source for. You can cascade the mixin keyword if you want to, that's very simple. I'd be highly surprised if you ever encountered a call tree where you wanted to inline everything (and the optimiser didn't do it for you). As soon as you encounter a single function in the tree that shouldn't be inlined, then you'll be forced to do it one level at a time anyway.
Mar 19 2014
"Ola Fosheim Grøstad" writes:
On Thursday, 20 March 2014 at 02:08:16 UTC, Manu wrote:
 The problem is upside down. If you want to inline multiple 
 levels, you
 start from the leaves and move downwards, not from the root 
 moving upwards

Yes, that is true in cases where leaves are frequently visited. Good point. I am most interested in full inlining, but the heuristics should probably start with the leaves for people not interested in that. Agree.

Anyway, in the case of ray tracing (or any search structure) I could see the value of having the opposite in combination with CTFE/partial evaluation.

Example: Define a static scene (of objects) and let the compiler turn it into "a state machine" of code.

Another example: Define an array of data, use partial evaluation to turn it into a binary tree, then turn the binary tree into code.
 Inlining should be strictly deliberate, there's nothing to say 
 that every
 function called in a tree should be inlined. There's a high 
 probability
 there's one/some that shouldn't be among a few that should.

In the case of a long running loop it does not really matter. What it does get you is a chance to use generic code (or libraries) and then do a first-resort optimization. I basically see it as a time-saving feature (programmer's time). A tool for cutting development costs.
 Remember too, that call-site inlining isn't the only method, 
 there would
 also be always-inline...

Yes, that is the first. I have in another thread some time ago suggested a solution that uses weighted inlining to aid compiler heuristics:

http://forum.dlang.org/thread/szjkyfpnachnnyknnfwp@forum.dlang.org#post-szjkyfpnachnnyknnfwp:40forum.dlang.org

As you can see I also suggested call-site inlining, so I am fully behind you in this. :-) Lack of inlining and GC are my main objections to D.
 I think always-inline is what you want for some
 decidedly trivial functions (although these will probably be 
 heuristically
 inlined anyway), not call-site inlining.

I agree. Compiler heuristics can change. It is desirable to be able to express intent no matter what the current heuristics are.
 I just don't see how recursive
 call-site inlining is appropriate, considering that call trees 
 are often
 complex, subject to change, and may even call functions that 
 you don't have
 source for.

You should not use it blindly.
 You can cascade the mixin keyword if you want to, that's very 
 simple.

Not if you build the inner loop using generic components. I want this

inline_everything while(condition){
    statement;
    statement;
}
 I'd be highly surprised if you ever encountered a call tree 
 where
 you wanted to inline everything (and the optimiser didn't do it 
 for you).

Not if you move to high-level programming using prewritten code and only go low level after profiling.
 As soon as you encounter a single function in the tree that 
 shouldn't be
 inlined, then you'll be forced to do it one level at a time 
 anyway.

But then you have to change the libraries you are using!?

Nothing prevents you from introducing exceptions as an extension, though. I want inline(0.5) as default, but also be able to write inline(1) for inline always and inline(0) for inline never.

func1(){} // implies inline(0.5) weighting
inline func2(){} // same as inline(1) weighting, inline always
inline(0.75) func3(){} // increase the heuristics weighting
inline(0) func4(){} // never-ever inline

Ola.
Mar 20 2014
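For reference, an inner loop built from generic components, as discussed above, might look like the following in D as it stands. This is a sketch under assumptions: `twice`, `plusOne` and `process` are hypothetical names, and the block deliberately uses neither the call-site `mixin` syntax nor the `inline(...)` weights proposed in the thread, since both are proposals rather than existing features.

```d
// Hypothetical work elements: ordinary functions that are also useful on their own.
int twice(int x) { return 2 * x; }
int plusOne(int x) { return x + 1; }

// Generic work loop receiving its two stages by alias.
// Nothing here forces `first` or `second` to be inlined into the loop body;
// that decision is left to the compiler's heuristics.
int process(alias first, alias second)(const int[] data)
{
    int total = 0;
    foreach (x; data)
        total += second(first(x));
    return total;
}
```

A call such as `process!(twice, plusOne)([1, 2, 3])` applies both stages to each element and sums the results. The open question in the thread is how the author of `process` could require that the two aliased calls be inlined at this one call site without marking `twice` and `plusOne` themselves.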
"Ola Fosheim Grøstad" writes:
I just want to add these reasons for having inlining despite 
having compiler heuristics:

1. If you compile for embedded or PNaCl on the web, you want a 
small executable. That means the heuristics should not inline if 
it increases the code size unless the programmer specified it in 
the code. (Or that you specify a target size, and do compiler 
re-runs until it fits.)

2. If you use profile guided optimization you should inline based 
on call frequency, but the input set might have missed some 
scenarios and you should be able to overrule the profile by 
explicit inlining in code where you know that it matters. (e.g. 
tight loop in an exception handler)
Mar 20 2014
Manu <turkeyman gmail.com> writes:

On 20 March 2014 18:35, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Thursday, 20 March 2014 at 02:08:16 UTC, Manu wrote:

 The problem is upside down. If you want to inline multiple levels, you
 start from the leaves and move downwards, not from the root moving upwards

 Yes, that is true in cases where leaves are frequently visited. Good point. I am most interested in full inlining, but the heuristics should probably start with the leaves for people not interested in that. Agree.

 Anyway, in the case of ray tracing (or any search structure) I could see the value of having the opposite in combination with CTFE/partial evaluation.

 Example: Define a static scene (of objects) and let the compiler turn it into "a state machine" of code.

 Another example: Define an array of data, use partial evaluation to turn it into a binary tree, then turn the binary tree into code.

 Inlining should be strictly deliberate, there's nothing to say that every
 function called in a tree should be inlined. There's a high probability
 there's one/some that shouldn't be among a few that should.

 In the case of a long running loop it does not really matter. What it does get you is a chance to use generic code (or libraries) and then do a first-resort optimization. I basically see it as a time-saving feature (programmer's time). A tool for cutting development costs.

 Remember too, that call-site inlining isn't the only method, there would
 also be always-inline...

 Yes, that is the first. I have in another thread some time ago suggested a solution that uses weighted inlining to aid compiler heuristics:

 http://forum.dlang.org/thread/szjkyfpnachnnyknnfwp@forum.dlang.org#post-szjkyfpnachnnyknnfwp:40forum.dlang.org

 As you can see I also suggested call-site inlining, so I am fully behind you in this. :-) Lack of inlining and GC are my main objections to D.

 I think always-inline is what you want for some
 decidedly trivial functions (although these will probably be heuristically
 inlined anyway), not call-site inlining.

 I agree. Compiler heuristics can change. It is desirable to be able to express intent no matter what the current heuristics are.

 I just don't see how recursive
 call-site inlining is appropriate, considering that call trees are often
 complex, subject to change, and may even call functions that you don't
 have source for.

 You should not use it blindly.

 You can cascade the mixin keyword if you want to, that's very simple.

 Not if you build the inner loop using generic components. I want this

 inline_everything while(condition){
     statement;
     statement;
 }

 I'd be highly surprised if you ever encountered a call tree where
 you wanted to inline everything (and the optimiser didn't do it for you).

 Not if you move to high-level programming using prewritten code and only go low level after profiling.

 As soon as you encounter a single function in the tree that shouldn't be
 inlined, then you'll be forced to do it one level at a time anyway.

 But then you have to change the libraries you are using!?

 Nothing prevents you from introducing exceptions as an extension, though. I want inline(0.5) as default, but also be able to write inline(1) for inline always and inline(0) for inline never.

 func1(){} // implies inline(0.5) weighting
 inline func2(){} // same as inline(1) weighting, inline always
 inline(0.75) func3(){} // increase the heuristics weighting
 inline(0) func4(){} // never-ever inline

 Ola.

I'm sorry. I really can't support any of these wildly complex ideas. I just don't feel they're useful, and they're not very well founded.
A numeric weight? What scale is it in? I'm not sure of any 'standard-inline-weight-measure' that any programmer would be able to intuitively gauge the magic number against. That will simply never be agreed on by the devs.
It also doesn't make much sense... different platforms will assign very different weights and different heuristics at the inliner. It's not a numeric quantity; it's a complex determination whether a function is a good candidate or not.
The value you specify is likely highly context sensitive and probably not portable. Heuristic-based inlining should be left to the optimiser to decide.

And I totally object to recursive inlining. It has a kind of absolute nature that removes control all the way down the call tree, and I don't feel it's likely that you would often (ever?) want to explicitly inline an entire call tree.
If you want to inline a second level, then write mixin in the second level. Recurse.
You are talking about generic code as if this isn't appropriate, but I specifically intend to use this in generic code very similar to what you suggest; so I don't see the incompatibility.
I think you're saying that manually specifying it all the way down the call tree is inconvenient, but I would argue that manually specifying *exclusions* throughout the call tree after specifying a recursive inline is even more inconvenient. It requires more language (a feature to mark an exclusion), has a kind of obtuse double-negative logic about it, and it's equally invasive to your code.

If you can prove that single level call-site inlining doesn't satisfy your needs at some later time, make a proposal then, along with your real-world use cases. But by throwing it in this thread right now, you're kinda just killing the thread, and making it very unlikely that anything will happen at all, which is annoying, because I REALLY need this (I've been trying to motivate inline support for over 3 years), and I get the feeling you're just throwing hypotheticals around.

You're still fairly new here, but be aware that feature requests will become exponentially less likely to be accepted with every degree of complexity added. By making this seem hard, you're also making it almost certain not to happen, which isn't in either of our interests.

My OP suggestion is the simplest solution I can conceive which will definitely satisfy all the real-world use cases that I've ever encountered. It is predictable, portable, simple.
Mar 20 2014
"Ola Fosheim Grøstad" writes:
On Thursday, 20 March 2014 at 12:31:33 UTC, Manu wrote:
 I'm sorry. I really can't support any of these wildly complex 
 ideas.

They aren't actually complex, except tail-call optimization (but that is well understood).

 If you want to inline a second level, then write mixin in the
 second level.

You might as well do copy-paste then. You cannot add inlining to an imported library without modifying it.

 at all, which is annoying, because I REALLY need this (I've
 been trying to motivate inline support for over 3 years), and
 I get the feeling you're just throwing hypotheticals around.

You need inlining, agreed, but not 1 level mixin, because you can do that with regular inlining.
Mar 20 2014
"Ola Fosheim Grøstad" writes:
Please note that 1 level mixin is not sufficient in the case of 
libraries. In too many cases you will not inline the function 
that does the work, only the interface wrapper.
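
The wrapper/worker situation described here can be sketched in D as follows. This is only an illustration under assumed names (`sumSquares`, `sumSquaresImpl` and `callerAfterOneLevel` are hypothetical, not from any real library): it shows what a caller effectively contains after exactly one level of inlining.

```d
// Hypothetical library code: a thin public wrapper over the real worker.
// All names are made up for illustration.

// The function that does the actual work.
private int sumSquaresImpl(const(int)[] values) pure nothrow @nogc @safe
{
    int total = 0;
    foreach (v; values)
        total += v * v;
    return total;
}

// The public interface wrapper: checks the input, then forwards.
int sumSquares(const(int)[] values) pure nothrow @nogc @safe
{
    if (values.length == 0)
        return 0;
    return sumSquaresImpl(values);
}

// What a caller effectively contains after inlining one level by hand:
// the wrapper's body is pasted in, but the call to the worker remains.
int callerAfterOneLevel(const(int)[] values) pure nothrow @nogc @safe
{
    if (values.length == 0)
        return 0;
    return sumSquaresImpl(values); // still an out-of-line call
}

unittest
{
    assert(sumSquares([1, 2, 3]) == 14);
    assert(callerAfterOneLevel([1, 2, 3]) == 14);
    assert(sumSquares(null) == 0);
}
```

Whether the remaining call to the worker gets inlined is then back in the hands of the optimiser's heuristics, which is the concern being raised.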
Mar 20 2014
Manu <turkeyman gmail.com> writes:

On 21 March 2014 00:10, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 Please note that 1 level mixin is not sufficient in the case of libraries.
 In too many cases you will not inline the function that does the work, only
 the interface wrapper.

I don't think I would ever want to inline the whole call tree of a library. I've certainly never wanted to do anything like that in 20 years or so, and I've worked on some really performance critical systems, like Amiga, Dreamcast, PS2.
It still sounds really sketchy. If the function that does the work is a few levels deep, then there is probably a good reason for that. What if there's an error check that writes log output or something? Or some branch that leads to other uncommon paths?

I think you're making this problem up. Can you demonstrate where this has been a problem for you in the past?
The call tree would have to be so very particular for this to be appropriate, and then you say this is a library, which you have no control over... so the call tree is just perfect by chance? What if the library changes?
Mar 20 2014
"ponce" <contact gam3sfrommars.fr> writes:
On Thursday, 20 March 2014 at 08:35:22 UTC, Ola Fosheim Grøstad 
wrote:
 Nothing prevents you to introduce exceptions as an extension 
 though. I want inline(0.5) as default, but also be able to 
 write inline(1) for inline always and inline(0) for inline 
 never.

 func1(){} // implies inline(0.5) weighting
 inline func2(){} // same as inline(1) weighting, inline always
 inline(0.75) func3(){} // increase the heuristics weighting
 inline(0) func4(){} // never-ever inline

It looks promising when seen like that, but to me introducing explicit inlining/deinlining corresponds to a precise process:
1. Bottleneck is identified.
2. "We could {inline|deinline} this call at this particular place and see what happens."
3. Apply inline directive for this call. Only "always" or "never" is ever wanted for me, and for 1 level only.
4. Measure and validate like all optimizations.

Now after this, even if the inlining becomes harmful for other reasons, I want this inlining to be maintained, whatever the cost, not subject to random rules I don't know of. When you tweak inlining, you are supposed to know what you are doing, and it's not just an optimization, it's an essential tool that enables other optimizations, helps disambiguate aliasing, helps the auto-vectorizer, helps constant propagation...

In the large majority of cases it can be left to the compiler, and in the 1% of cases that matter I want to do it explicitly, full stop.
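
For reference, D compilers released after this thread provide `pragma(inline, true)` and `pragma(inline, false)`, which give the "always" and "never" settings asked for in step 3. The pragma attaches to a declaration rather than to an individual call, so it covers only part of what is requested here, and how strictly it is honoured depends on the compiler and its flags. A minimal sketch, with hypothetical function names:

```d
// Declaration-site control with D's pragma(inline), which postdates this thread.
// Function names are hypothetical.

// "Always": ask the compiler to inline this at every call site.
pragma(inline, true)
int clampToByte(int x) pure nothrow @nogc @safe
{
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

// "Never": keep this cold path out of line.
pragma(inline, false)
int outOfRangeCount(const(int)[] xs) pure nothrow @nogc @safe
{
    int n = 0;
    foreach (x; xs)
        if (x < 0 || x > 255)
            ++n;
    return n;
}

// No pragma: left to the optimiser's own heuristics.
int saturatedSum(const(int)[] xs) pure nothrow @nogc @safe
{
    int total = 0;
    foreach (x; xs)
        total += clampToByte(x);
    return total;
}

unittest
{
    assert(clampToByte(-5) == 0);
    assert(clampToByte(300) == 255);
    assert(saturatedSum([-1, 10, 999]) == 265);
    assert(outOfRangeCount([-1, 10, 999]) == 2);
}
```

Step 4 still applies: the pragma expresses intent, and only measurement shows whether it helped.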
Mar 20 2014
"Ola Fosheim Grøstad" writes:
On Thursday, 20 March 2014 at 15:26:35 UTC, ponce wrote:
 Now after this, even if the inlining become harmful for other 
 reasons, I want this inlining to be maintained, whatever the 
 cost, not subject to random rules I don't know of. When you

The rules aren't random. The inliner conceptually uses weighting anyway, you just increase the threshold for a specific call-tree.

E.g. if a function is on the borderline of being inlined, the probability is 50% if you add some noise to the selection with a magnitude that equals the "typical approximation error" of the heuristics. "inline(0.75)" should increase the probability to 75%. Today all functions have an implied "inline(0.5)".

I think you should have this kind of control for all compiler heuristics thresholds that are "arbitrary", not only inlining.

Call site inlining is primarily useful for inlining external code. The alternative is usually to replace libraries with your own version.
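
One way to read the probability argument above is the toy model below. It is an interpretation, not a description of any real inliner: the uniform noise, the `error` half-width and the function name `inlineProbability` are all assumptions made for illustration.

```d
// A toy model of weighted inlining; not how any real inliner is implemented.
// The uniform-noise assumption and all names are illustrative.

// Probability that a call is inlined, given:
//   score     - the heuristic's benefit estimate for this call
//   threshold - the inliner's cut-off
//   error     - half-width of uniform noise modelling heuristic error (> 0)
//   weight    - the proposed inline(w) hint in [0, 1]; 0.5 means "no hint"
double inlineProbability(double score, double threshold,
                         double error, double weight) pure nothrow @nogc @safe
{
    // inline(1) and inline(0) are meant as "always" and "never".
    if (weight >= 1.0)
        return 1.0;
    if (weight <= 0.0)
        return 0.0;

    // Shift the decision so that a borderline call (score == threshold)
    // is inlined with probability equal to `weight`.
    immutable bias = error * (2.0 * weight - 1.0);

    // P(score + bias + noise >= threshold), noise uniform in [-error, +error].
    immutable p = (score + bias - threshold + error) / (2.0 * error);
    return p < 0.0 ? 0.0 : (p > 1.0 ? 1.0 : p);
}

unittest
{
    // Borderline call, no hint: 50%.
    assert(inlineProbability(10.0, 10.0, 2.0, 0.5) == 0.5);
    // Borderline call with inline(0.75): 75%.
    assert(inlineProbability(10.0, 10.0, 2.0, 0.75) == 0.75);
    // Far below the threshold, a mild hint changes nothing.
    assert(inlineProbability(0.0, 10.0, 2.0, 0.75) == 0.0);
    // inline(1) and inline(0) override the heuristic entirely.
    assert(inlineProbability(0.0, 10.0, 2.0, 1.0) == 1.0);
    assert(inlineProbability(100.0, 10.0, 2.0, 0.0) == 0.0);
}
```

In this reading the weight only matters for calls within `error` of the threshold, which is also where the objection about portability across platforms and heuristics would bite.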
Mar 20 2014
prev sibling parent "Puming" <zhaopuming gmail.com> writes:
Maybe we could have both declare-site inlining and call-site
inlining.

With declare-site inlining, what we mean is that this function's body is
used so commonly that we make it into a function only because we
don't want duplicate code, not because it should be a standalone
function.

With call-site inlining, one can inline third-party functions
which are not declared inline.

I think the `inline` Manu suggested should not be viewed as a
mere optimization thing, but more like a code generation utility
which happens to be faster. From this point of view, this kind of
`inline` should be controlled by the coder, not the compiler.

To make it clear that we are not talking about optimization,
maybe we should call it another name, like 'mixin function'?

BTW, the Kotlin language recently got a new release, which added
support for declare-site force inline. The team argues its
necessity here:

http://blog.jetbrains.com/kotlin/2014/03/m7-release-available/#more-1439

in the comments:
It’s traditional to think about inlining as a mere optimization,
but this dates back to the times when software was shipped as
one huge binary file.

Why we think inline should be a language feature:
1. Other language features (to be implemented soon) depend on
it. Namely, non-local returns and type-dependent functions.
Basically, inline functions are very restricted macros, and this
is definitely a language feature.

2. [...] not be up to the compiler whether to inline something or not on the JVM: if bodies of inline functions change, all dependent code should be recompiled, i.e. it’s the library author’s liability to preserve functionality, so such functions must be explicitly marked.

On Thursday, 20 March 2014 at 02:08:16 UTC, Manu wrote:
 On 20 March 2014 06:23, <7d89a89974b0ff40.invalid internationalized.invalid> wrote:

 On Wednesday, 19 March 2014 at 12:35:30 UTC, Manu wrote:

 Okay, do you have use cases for any of this stuff? Are you 
 just making it
 up, or do you have significant experience to say this is what 
 you need?

I don't need anything, I hand optimize prematurely. And I don't want to do that. But yes, inner loops benefit from exhaustive inlining because you get to move common expressions out of the loop or change them to delta increments. It is only when you trash the caches that inlining does not pay off. I do it by hand. I don't want to do it by hand.

 If you ask me, I have no value in recursive inlining, in fact, that would
 anger me considerably.

Why? You could always set the depth to 1, or make 1 the default. And it isn't difficult to implement.

The problem is upside down. If you want to inline multiple levels, you start from the leaves and move downwards, not from the root moving upwards (leaving a bunch of leaves perhaps not inlined), which is what you're really suggesting.

Inlining should be strictly deliberate, there's nothing to say that every function called in a tree should be inlined. There's a high probability there's one/some that shouldn't be among a few that should.

Remember too, that call-site inlining isn't the only method, there would also be always-inline. I think always-inline is what you want for some decidedly trivial functions (although these will probably be heuristically inlined anyway), not call-site inlining.

I just don't see how recursive call-site inlining is appropriate, considering that call trees are often complex, subject to change, and may even call functions that you don't have source for.

You can cascade the mixin keyword if you want to, that's very simple.

I'd be highly surprised if you ever encountered a call tree where you wanted to inline everything (and the optimiser didn't do it for you). As soon as you encounter a single function in the tree that shouldn't be inlined, then you'll be forced to do it one level at a time anyway.
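
The point quoted earlier in this message, that inlining lets an inner loop move common expressions out of the loop or turn them into delta increments, can be illustrated with a small D sketch. The names (`offsetOf`, `sumRow`, `sumRowByHand`) are hypothetical, and the second function only shows by hand the kind of code an optimiser may produce once the helper is inlined; no particular compiler is claimed to do exactly this.

```d
// Hypothetical helper that an inner loop might call.
size_t offsetOf(size_t row, size_t col, size_t stride) pure nothrow @nogc @safe
{
    return row * stride + col;
}

// The loop as written against the helper.
int sumRow(const(int)[] data, size_t row, size_t width, size_t stride)
    pure nothrow @nogc @safe
{
    int total = 0;
    foreach (col; 0 .. width)
        total += data[offsetOf(row, col, stride)];
    return total;
}

// Hand-written equivalent of what becomes possible once offsetOf is inlined:
// row * stride is hoisted, and the index becomes a running increment.
int sumRowByHand(const(int)[] data, size_t row, size_t width, size_t stride)
    pure nothrow @nogc @safe
{
    int total = 0;
    size_t index = row * stride; // loop-invariant part, computed once
    foreach (_; 0 .. width)
    {
        total += data[index];
        ++index;                 // delta increment replaces the multiply
    }
    return total;
}

unittest
{
    int[] grid = [1, 2, 3,
                  4, 5, 6];
    assert(sumRow(grid, 1, 3, 3) == 15);
    assert(sumRowByHand(grid, 1, 3, 3) == 15);
    assert(sumRow(grid, 0, 2, 3) == sumRowByHand(grid, 0, 2, 3));
}
```

If `offsetOf` stays an opaque call, the multiply has to be redone on every iteration, which is the case for forcing the inline that both sides of this exchange agree on at one level.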

Mar 23 2014