digitalmars.D - DIP56 Provide pragma to control function inlining

Walter Bright <newshound2 digitalmars.com> writes:

http://wiki.dlang.org/DIP56

Manu has needed always inlining, and I've needed never inlining. This DIP 
proposes a simple solution.

Feb 23 2014

"Mike" <none none.com> writes:

On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never 
 inlining. This DIP proposes a simple solution.

Is this a front-end thing or something specific to DMD?  I'm 
wondering because I'd like something like this for GDC and LDC 
when targeting ARM microcontrollers.  The inline keyword makes 
quite a significant performance improvement in one of my current 
C++ projects, and I anticipate the same result when I convert it 
to D.

Any chance of adding an "optimize, true/false" pragma also to get 
around the lack of a volatile keyword? (Just a question, I don't 
mean to hijack this thread and turn it into another volatile keyword 
debate).

Mike

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:21 AM, Mike wrote:
 Is this a front-end thing or something specific to DMD?  I'm wondering because
 I'd like something like this for GDC and LDC when targeting ARM
 microcontrollers.  The inline keyword makes quite a significant performance
 improvement in one of my current C++ projects, and I anticipate the same result
 when I convert it to D.

It's a hint to the compiler - the compiler is allowed to ignore it if it
doesn't support it.


 Any chance of adding an "optimize, true/false" pragma also to get around the
 lack of a volatile keyword? (Just a question, I don't mean to hijack this
 thread and turn it into another volatile keyword debate).

Please start another thread with your proposal.

Feb 23 2014

Benjamin Thaut <code benjamin-thaut.de> writes:

On 23.02.2014 13:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

Why a pragma? Can't we use a UDA and give it some special meaning inside 
the compiler?

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:25 AM, Benjamin Thaut wrote:
 Why a pragma? Can't we use a UDA and give it some special meaning inside the
 compiler?

This shouldn't be an attribute, it's a hint to the compiler optimizer. Pragma
is ideally suited to that.

Feb 23 2014

"Namespace" <rswhite4 googlemail.com> writes:

On Sunday, 23 February 2014 at 12:25:20 UTC, Benjamin Thaut wrote:
 On 23.02.2014 13:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 Why a pragma? Can't we use a UDA and give it some special 
 meaning inside the compiler?

+1
I would also prefer an attribute which can be used as a label.

----
@inline(true):
// ...
@inline(false):
// ...
@inline(default):
----

Feb 23 2014

"Namespace" <rswhite4 googlemail.com> writes:

On Sunday, 23 February 2014 at 19:10:08 UTC, Namespace wrote:
 On Sunday, 23 February 2014 at 12:25:20 UTC, Benjamin Thaut wrote:
 On 23.02.2014 13:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 Why a pragma? Can't we use a UDA and give it some special 
 meaning inside the compiler?

 +1
 I would also prefer an attribute which can be used as a label.

 ----
 @inline(true):
 // ...
 @inline(false):
 // ...
 @inline(default):
 ----

I still prefer the attribute/UDA idea but in case of pragma:

pragma(inline, true);
pragma(inline, false);
pragma(inline, default);

?
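
For concreteness, a minimal sketch of how the statement form could look inside function bodies. It assumes the pragma applies to the function whose body contains it, and it uses an argument-less `pragma(inline);` in place of `pragma(inline, default);` so that no keyword is needed as an argument. The function names are made up for illustration.

```d
// Hypothetical helpers; only the placement of the pragma matters here.
int twice(int x)
{
    pragma(inline, true);   // ask the compiler to always inline this function
    return x * 2;
}

int slowPath(int x)
{
    pragma(inline, false);  // ask the compiler never to inline this function
    return x + 1;
}

int plain(int x)
{
    pragma(inline);         // leave the decision to the compiler's own heuristics
    return x - 1;
}

void main()
{
    // The pragma only influences code generation, never the results.
    assert(twice(21) == 42);
    assert(slowPath(41) == 42);
    assert(plain(43) == 42);
}
```

Whether a compiler honours, ignores, or rejects the request is exactly what the rest of this thread argues about; the sketch only shows the syntax.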

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 1:41 PM, Namespace wrote:
 pragma(inline, true);
 pragma(inline, false);
 pragma(inline, default);

'default' being a keyword makes for an ugly special case in how pragmas
are parsed.

Feb 23 2014

Lionello Lunesu <lionello lunesu.remove.com> writes:

On 24/02/14 06:12, Walter Bright wrote:
 On 2/23/2014 1:41 PM, Namespace wrote:
 pragma(inline, true);
 pragma(inline, false);
 pragma(inline, default);

 'default' being a keyword makes for an ugly special case in how pragmas
 are parsed.

Aren't true and false keywords as well?

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:47 PM, Lionello Lunesu wrote:
 On 24/02/14 06:12, Walter Bright wrote:
 On 2/23/2014 1:41 PM, Namespace wrote:
 pragma(inline, true);
 pragma(inline, false);
 pragma(inline, default);

 'default' being a keyword makes for an ugly special case in how pragmas
 are parsed.

 Aren't true and false keywords as well?

Yes, but they are also expressions. default is not.

Feb 23 2014

"Tove" <tove fransson.se> writes:

On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never 
 inlining. This DIP proposes a simple solution.

yay, all for it! The DIP should probably specify what happens if 
inlining fails, i.e. generate a compilation error.

Could we consider adding "flatten" in the same dip?

quote from gcc
"Flatten
Generally, inlining into a function is limited. For a function 
marked with this attribute, every call inside this function is 
inlined, if possible. Whether the function itself is considered 
for inlining depends on its size and the current inlining 
parameters. "

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

23-Feb-2014 16:25, Tove wrote:
 On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 yay, all for it! The DIP should probably specify what happens if
 inlining fails, i.e. generate a compilation error.

 Could we consider adding "flatten" in the same dip?

 quote from gcc
 "Flatten
 Generally, inlining into a function is limited. For a function marked
 with this attribute, every call inside this function is inlined, if
 possible. Whether the function itself is considered for inlining depends
 on its size and the current inlining parameters. "

Yes, please.

-- 
Dmitry Olshansky

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.

I suspect that may cause problems, because different compilers will have 
different inlining capabilities. I think it should be a 'recommendation' to the 
compiler.

Feb 23 2014

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Sunday, 23 February 2014 at 12:57:00 UTC, Walter Bright wrote:
 On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.

 I suspect that may cause problems, because different compilers 
 will have different inlining capabilities. I think it should be 
 a 'recommendation' to the compiler.

I think there should be some way to force the compiler to inline 
a function. As a bonus, the error message can tell the programmer 
why the function could not be inlined, allowing them to make the 
necessary adjustments.

Different compilers will have different inlining capabilities, 
however at the point where programmers are forcing inlining on or 
off, they are already micro-optimizing at a level which implies 
dependency on particular compiler implementations.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:01 AM, Vladimir Panteleev wrote:
 I think there should be some way to force the compiler to inline a function. As
 a bonus, the error message can tell the programmer why the function could not
 be inlined, allowing them to make the necessary adjustments.

 Different compilers will have different inlining capabilities, however at the
 point where programmers are forcing inlining on or off, they are already
 micro-optimizing at a level which implies dependency on particular compiler
 implementations.

I think it would be a porting nuisance to error out when the compiler can't 
inline. The user would then fix it by versioning out for that compiler, and
then the user is back to the same state as it being a recommendation.

Generally, when I optimize at that level, I have a window open on the assembler 
output of the compiler and I go back and forth on the source code until I get 
the shape of the assembler I need. Having compiler messages wouldn't be very 
helpful.

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Sunday, 23 February 2014 at 20:40:44 UTC, Walter Bright wrote:
 Generally, when I optimize at that level, I have a window open 
 on the assembler output of the compiler and I go back and forth 
 on the source code until I get the shape of the assembler I 
 need. Having compiler messages wouldn't be very helpful.

OK, so you are at this point: you check the assembly and find out that 
the compiler ignored your recommendation, with no error messages or 
explanations. Next step?

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

24-Feb-2014 00:40, Walter Bright wrote:
 On 2/23/2014 5:01 AM, Vladimir Panteleev wrote:
 I think there should be some way to force the compiler to inline a
 function. As a bonus, the error message can tell the programmer why the
 function could not be inlined, allowing them to make the necessary
 adjustments.

 Different compilers will have different inlining capabilities, however
 at the point where programmers are forcing inlining on or off, they are
 already micro-optimizing at a level which implies dependency on
 particular compiler implementations.

 I think it would be a porting nuisance to error out when the compiler
 can't inline. The user would then fix it by versioning out for that
 compiler, and then the user is back to the same state as it being a
 recommendation.

Porting across compilers, you mean?
While porting, making temporary changes is fine, like turning off 
force_inline where it doesn't work. Without this error you are facing a 
silent performance disaster that you still need to figure out.

Fail fast for the win.

 Generally, when I optimize at that level, I have a window open on the
 assembler output of the compiler and I go back and forth on the source
 code until I get the shape of the assembler I need. Having compiler
 messages wouldn't be very helpful.

It will save you the trouble of looking at the assembly window to begin 
with, because you know ahead of time that you won't see what you'd like.

-- 
Dmitry Olshansky

Feb 23 2014

"Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:

On Sunday, 23 February 2014 at 20:40:44 UTC, Walter Bright wrote:
 Generally, when I optimize at that level, I have a window open 
 on the assembler output of the compiler and I go back and forth 
 on the source code until I get the shape of the assembler I 
 need. Having compiler messages wouldn't be very helpful.

Not everyone has the time/knowledge for checking the ASM at every 
recompile. Personally I wouldn't be able to do something like 
this that often, and yet I'd love to know that something is 
not working ASAP.

Code changes, and it changes a lot during development. Having a 
way to make sure that one or more functions stay inlined is handy 
to have. If such a pragma doesn't guarantee inlining, that means 
we will have no way to check it quickly. Sometimes fail fast is 
really the best choice.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 1:32 PM, Francesco Cattoglio wrote:
 [...]

I addressed these three messages in another reply to Dmitry.

Feb 23 2014

"Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:

On Sunday, 23 February 2014 at 21:55:11 UTC, Walter Bright wrote:
 On 2/23/2014 1:32 PM, Francesco Cattoglio wrote:
 [...]

 I addressed these three messages in another reply to Dmitry.

Read that, and you do make a point. I am no expert on 
optimization, but as far as I could tell, inlining is usually the 
easiest and most rewarding of the optimizations one can do. I 
know you kind of hate warnings, but perhaps we could at least get 
a warning if something cannot be inlined?

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

As a compromise diagnostics about refused inlining can be added 
as special output category to 
https://github.com/D-Programming-Language/dmd/pull/645

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

23-Feb-2014 16:57, Walter Bright wrote:
 On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.

 I suspect that may cause problems, because different compilers will have
 different inlining capabilities. I think it should be a 'recommendation'
 to the compiler.

It's going to be near useless if it doesn't make sure inlining happened.
Part of the reason for forced inline is always inlining some core 
primitives, even in debug builds.

The other point is what Vladimir mentioned - we are already doing 
micro-optimization, hence it had better error out than turn a blind eye to 
our tinkering. I would not like to ever have to get down and look at 
ASM for every function just to make sure it was inlined.

-- 
Dmitry Olshansky

Feb 23 2014

"Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:

On Sunday, 23 February 2014 at 13:07:27 UTC, Dmitry Olshansky 
wrote:
 It's going to be near useless if it doesn't make sure inlining 
 happened.

I completely agree.

Feb 23 2014

"Joseph Cassman" <jc7919 outlook.com> writes:

On Sunday, 23 February 2014 at 13:07:27 UTC, Dmitry Olshansky 
wrote:
 23-Feb-2014 16:57, Walter Bright wrote:
 On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining 
 fails,
 i.e. generate a compilation error.

 I suspect that may cause problems, because different compilers 
 will have
 different inlining capabilities. I think it should be a 
 'recommendation'
 to the compiler.

 It's going to be near useless if it doesn't make sure inlining 
 happened.
 Part of the reason for forced inline is always inlining some 
 core primitives, even in debug builds.

 The other point is what Vladimir mentioned - we are already doing 
 micro-optimization, hence it had better error out than turn a blind 
 eye to our tinkering. I would not like to ever have to get 
 down and look at ASM for every function just to make sure it 
 was inlined.

That is most likely when I would make use of the concept too. And 
a message from the compiler in its output telling me when such an 
inline request failed would be helpful.

Joseph

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Sunday, 23 February 2014 at 13:07:27 UTC, Dmitry Olshansky 
wrote:
 It's going to be near useless if it doesn't make sure inlining 
 happened.
 Part of the reason for forced inline is always inlining some 
 core primitives, even in debug builds.

An optional recommendation for inlining already exists - it is the 
current default. This pragma needs to result in a compile-time 
error if used where inlining is not possible, to be of any use.

Other than that, looks fine.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 11:04 AM, Dicebot wrote:
 Optional recommendation for inlining already exists - it is current default.

That is not the point of the pragma. The point of always inlining is (as Manu 
explained) some functions need to be inlined even in debug mode, as the code 
would otherwise be too slow to even debug.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:07 AM, Dmitry Olshansky wrote:
 Part of the reason for forced inline is always inlining some core primitives,
 even in debug builds.

Right - and if the compiler won't do it, how does the error message help?

 I would not like to ever have to get down and look at ASM for every
 function just to make sure it was inlined.

By the time you get to the point of checking on inlining, you're already
looking at the assembler output, because the function is on the top of the
profile of time wasters, and that's how you take it to the next level of
performance.

The trouble with an error message, is what (as the user) can you do about it?

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

24-Feb-2014 00:46, Walter Bright wrote:
 On 2/23/2014 5:07 AM, Dmitry Olshansky wrote:
 Part of the reason for forced inline is always inlining some core
 primitives,
 even in debug builds.

 Right - and if the compiler won't do it, how does the error message help?

That programmer is instantly aware that it can't be done due to some 
reason. Keep in mind that code changes with time and running 
profiler/disassembler on every tiny change to make sure the stuff is 
still inlined is highly counter-productive.

 I would not like to ever have to get down and look at ASM for
 every function just to make sure it was inlined.

 By the time you get to the point of checking on inlining, you're already
 looking at the assembler output, because the function is on the top of
 the profile of time wasters, and that's how you take it to the next
 level of performance.

A one-off activity. Now what guarantee will you have that it will keep 
getting inlined? Right, nothing.

 The trouble with an error message, is what (as the user) can you do
 about it?

Re-write till the compiler loves it, that is what we do today anyway. Else 
we wouldn't mark it as force_inline in the first place.

With an error you get a huge advantage - _instant_ feedback that it 
doesn't do what you want it to do. Otherwise you get the extra pleasure 
of running a disassembler to pinpoint your favorite call sites or 
observing that your profiler shows the same awful stats.

-- 
Dmitry Olshansky

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 1:04 PM, Dmitry Olshansky wrote:
 That programmer is instantly aware that it can't be done due to some reason.
 Keep in mind that code changes with time and running profiler/disassembler on
 every tiny change to make sure the stuff is still inlined is highly
 counter-productive.

I'm aware of that, but once you add the:

     version(BadCompiler) { } else pragma(inline, true);

things will never get better for BadCompiler. And besides, that line looks
awful.


 By the time you get to the point of checking on inlining, you're already
 looking at the assembler output, because the function is on the top of
 the profile of time wasters, and that's how you take it to the next
 level of performance.

 A one-off activity. Now what guarantees you will have that it will keep getting
 inlined? Right, nothing.

You're always going to have that issue when optimizing at that level, and it 
will be for a large range of constructs. For example, you may need variable x
to be enregistered. You may need some construct to be implemented as a ROL 
instruction. You may need a switch to be implemented as a binary search.


 The trouble with an error message, is what (as the user) can you do
 about it?

 Re-write till compiler loves it, that is what we do today anyway. Else we
 wouldn't mark it as force_inline in the first place.

In which case there will be two code paths selected with a
version(BadCompiler). I have a hard time seeing the value in supporting both
code paths - the programmer would just use the workaround code always.


 With error - you get a huge advantage - an _instant_ feedback that it doesn't do
 what you want it to do. Otherwise it gets the extra pleasure of running
 disassembler to pinpoint your favorite call sites or observing that your
 profiler shows the same awful stats.

My point is you're going to have to look at the asm of the top functions on the 
profiler stats anyway, or you're wasting your time trying to optimize the code. 
(Speaking from considerable experience doing that.) There's a helluva lot more
to optimizing effectively than inlining, and it takes some back-and-forth
tweaking source code and looking at the assembler. I gave some examples of
that above.

And yes, performance critical code often suffers from bit rot, and changes in 
the compiler, and needs to be re-tuned now and then.

I suspect if the compiler errors out on a failed inline, it'll be much less 
useful than one might think.

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Sunday, 23 February 2014 at 21:53:43 UTC, Walter Bright wrote:
 On 2/23/2014 1:04 PM, Dmitry Olshansky wrote:
 That programmer is instantly aware that it can't be done due 
 to some reason.
 Keep in mind that code changes with time and running 
 profiler/disassembler on
 every tiny change to make sure the stuff is still inlined is 
 highly
 counter-productive.

 I'm aware of that, but once you add the:

     version(BadCompiler) { } else pragma(inline, true);

Once one resorts to force_inline and similar micro-optimisations, 
one usually sticks to a single "good" compiler, as code gen needs to 
be re-profiled for each compiler anyway.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 1:53 PM, Walter Bright wrote:
 And yes, performance critical code often suffers from bit rot, and changes in
 the compiler, and needs to be re-tuned now and then.

BTW, just to reiterate, there are *thousands* of optimizations the compiler may 
or may not do. And yes, performance critical code will often rely on them, and 
code is often tuned to 'tickle' certain ones.


For example, I know a fellow years ago who thought he had invented a
spectacular new string processing algorithm. He had the benchmarks to prove
it, and published an article with his with/without benchmark.

Unfortunately, the without benchmark contained an extra DIV instruction that, 
due to the vagaries of optimization, the compiler hadn't elided. That DIV had 
nothing to do with the algorithm, but the benchmark timing differences were 
totally due to its presence/absence.

He would have spotted it if he'd ever looked at the asm generated, and saved 
himself from some embarrassment.


I understand that in an ideal world one should never have to look at asm, but
if you're writing high performance code and don't look at asm, the code is
never going to beat the competition.

Feb 23 2014

"QAston" <qaston gmail.com> writes:

On Sunday, 23 February 2014 at 21:53:43 UTC, Walter Bright wrote:
 I'm aware of that, but once you add the:

     version(BadCompiler) { } else pragma(inline, true);

 things will never get better for BadCompiler.

This is exactly what caused the mess with HTTP user-agent strings, when 
both browsers tried to present web pages better and web devs 
tried to tune their pages to browsers with distinct features. Now 
Chrome says it's Mozilla, KHTML, Gecko and Safari. But is that 
really a problem? I don't think much code relies on compiler 
intrinsics. If it does, perhaps a way to specify attributes in one 
place and then reference those (like a CUSTOM_INLINE define in C) 
would help.

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

24-Feb-2014 01:53, Walter Bright wrote:
 On 2/23/2014 1:04 PM, Dmitry Olshansky wrote:
 That programmer is instantly aware that it can't be done due to some
 reason.
 Keep in mind that code changes with time and running
 profiler/disassembler on
 every tiny change to make sure the stuff is still inlined is highly
 counter-productive.

 I'm aware of that, but once you add the:

      version(BadCompiler) { } else pragma(inline, true);

 things will never get better for BadCompiler. And besides, that line
 looks awful.

You're actually arguing against yourself here - for porting you 
typically suggest:

version(OS1)
  ...
else version(OS2)
  ...
else
  static assert(0);

Why is forced_inline any different from other porting (where you want to 
fail fast)?
 By the time you get to the point of checking on inlining, you're already
 looking at the assembler output, because the function is on the top of
 the profile of time wasters, and that's how you take it to the next
 level of performance.

 A one-off activity. Now what guarantees you will have that it will
 keep getting
 inlined? Right, nothing.

 You're always going to have that issue when optimizing at that level,
 and it will be for a large range of constructs. For example, you may
 need variable x to be enregistered. You may need some construct to be
 implemented as a ROL instruction. You may need a switch to be
 implemented as a binary search.

Let's not detract from the original point. ROL is done as an intrinsic, and 
there are different answers to many of these questions that are BETTER 
than _always_ triple checking by hand and doing re-writes. Switch may 
benefit from pragmas as well, and modern compilers allow tweaking it. In 
fact LLVM allows assigning weights to specify which cases are more probable.

Almost all of the listed issues could be addressed better than by dancing 
around a disassembler and trying to please a PARTICULAR COMPILER in the many 
cases you listed above.

Yes, looking at ASM is important, but not every single case should 
require the painful cycle of:
compile-->disassemble-->re-write-->compile-->...

 The trouble with an error message, is what (as the user) can you do
 about it?

 Re-write till compiler loves it, that is what we do today anyway. Else we
 wouldn't mark it as force_inline in the first place.

 In which case there will be two code paths selected with a
 version(BadCompiler). I have a hard time seeing the value in supporting
 both code paths - the programmer would just use the workaround code always.

Your nice tried and true way of doing things is EQUALLY FRAGILE (if not 
more) and highly coupled to the compiler, but only SILENTLY so.

 With an error you get a huge advantage - _instant_ feedback that it
 doesn't do what you want it to do. Otherwise you get the extra pleasure of
 running a disassembler to pinpoint your favorite call sites, or of observing
 that your profiler shows the same awful stats.

 My point is you're going to have to look at the asm of the top functions
 on the profiler stats anyway, or you're wasting your time trying to
 optimize the code.

Like I don't know already, getting in this discussion.

 (Speaking from considerable experience doing that.)

And since you've come to enjoy it as is, you accept no improvements to 
that process? So you know it's hard fighting the compiler, and yet, like 
a samurai, you resolutely reject any help in dealing with it. I seriously 
don't get the point.

GCC has force inline, let's look at what GCC does with its always_inline:
http://gcc.gnu.org/ml/gcc-help/2007-01/msg00051.html

Quote of interest:

---

 **5) Could there be any situation, where a function with always_inline
 is _silently_ not embedded?

I hope not.  I don't know of any.

---

 There's a heluva lot more to optimizing effectively than inlining, and
 it takes some back-and-forth tweaking source code and looking at the
 assembler. I gave some examples of that above.

That there are other reasons to look at disassembly is not a 
good reason to force people to double-check the compiler for basic 
inlining.

 And yes, performance critical code often suffers from bit rot, and
 changes in the compiler, and needs to be re-tuned now and then.

And you accept no safe-guards against this because that is "the true old 
way"?

 I suspect if the compiler errors out on a failed inline, it'll be much
 less useful than one might think.

On the contrary, at least I would have to spend less time checking in ASM 
listings that the intended optimizations are being done.

-- 
Dmitry Olshansky

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 3:00 PM, Dmitry Olshansky wrote:
 You are actually going against yourself with this argument - for porting you
 typically suggest:

 version(OS1)
   ...
 else version(OS2)
   ...
 else
 static assert(0);

There's not much choice about that. I also suggest moving such code into 
separate modules.


 Your nice tried and true way of doing things is EQUALLY FRAGILE (if not more)
 and highly coupled to the compiler, but only SILENTLY so.

That's very true. Do you suggest the compiler emit a list of what optimizations 
it did or did not do? What makes inlining special, as opposed to, say, 
enregistering particular variables?

Feb 23 2014

"Mike" <none none.com> writes:

On Sunday, 23 February 2014 at 23:49:57 UTC, Walter Bright wrote:
 What makes inlining special, as opposed to, say, enregistering 
 particular variables?

The difference is it was explicitly told to do something and 
didn't.  That's insubordination.

Mike

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 3:55 PM, Mike wrote:
 The difference is it was explicitly told to do something and didn't.  That's
 insubordination.

I view this as more in the manner of providing the equivalent of runtime 
profiling information to the optimizer, in indirectly saying how often a 
function is executed.

Optimizing is a rather complicated process, and particular optimizations very 
often have weird and unpredictable interactions with other optimizations.

For example, in the olden days, C compilers had a 'register' storage class. 
Optimizers' register allocation strategy was so primitive it needed help. Over 
time, however, it became apparent that uses of 'register' became bit-rotted due 
to maintenance, resulting in all the wrong variables being enregistered. 
Compiler register allocation got a lot better, almost always being better than 
the users'. Not only that, but with generic code, and optimization rewrites of 
code, many variables would disappear and new ones would take their place. 
Different CPUs needed different register allocation strategies. What to do with 
'register' then?

The result was compilers began to take the 'register' as a hint, and eventually 
moved to totally ignoring 'register', as it turned out to be a pessimization.

I suspect that elevating one particular optimization hint to being an absolute 
command may not turn out well. Inlining already has performance issues, as it 
may increase the size of an inner loop beyond what will fit in the cache, for 
just one unexpected result. For another it may mess up the register allocation 
of the caller. "Inlining makes it faster" is not always true. Do you really want 
to weld this in as an absolute requirement in the language?

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Monday, 24 February 2014 at 00:33:09 UTC, Walter Bright wrote:
 I suspect that elevating one particular optimization hint to 
 being an absolute command may not turn out well. Inlining 
 already has performance issues, as it may increase the size of 
 an inner loop beyond what will fit in the cache, for just one 
 unexpected result. For another it may mess up the register 
 allocation of the caller. "Inlining makes it faster" is not 
 always true. Do you really want to weld this in as an absolute 
 requirement in the language?

The fact that the original C "inline" was designed in the same 
"permissive" way and is almost unused in practice (as opposed to 
compiler-specific force_inline attributes) does say something.

It is not a feature that should be designed for mass usage.

Feb 23 2014

"Araq" <rumpf_a web.de> writes:

 The fact that original C "inline" was designed in same 
 "permissive" way and is almost unused in practice (as opposed 
 to compiler-specific force_inline attributes) does say 
 something.

Do you mind backing up your "fact" with some numbers? Afaict 
'inline' is more common than __attribute__((forceinline)). (Well 
ok, for C code #define is even more common, but most C code is 
stuck in the 70s anyway so that doesn't mean anything.)

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Monday, 24 February 2014 at 01:09:46 UTC, Araq wrote:
 Do you mind to back up your "fact" with some numbers? Afaict 
 'inline' is more common than __attribute__((forceinline)). 
 (Well ok for C code #define is even more common, but most C 
 code is stuck in the 70ies anyway so that doesn't mean 
 anything.)

I can't link you to the closed projects I have worked on before, so 
you can surely choose not to trust my memories. Normal `inline` is common 
in headers because you can't have non-inlined function bodies in 
headers. In actual translation units it comes only from those who 
actually expect it to have a forceinline effect (I have not met a 
single case where adding it made any difference to gcc's 
decision to inline or not). This was my actual point - not that 
no one uses "inline", but that the very same lax definition has 
turned it essentially into a no-op, making a 
compiler-specific alternative necessary.

Feb 24 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 4:33 PM, Walter Bright wrote:
 On 2/23/2014 3:55 PM, Mike wrote:
 The difference is it was explicitly told to do something and didn't.
 That's insubordination.

 I view this as more in the manner of providing the equivalent of runtime
 profiling information to the optimizer, in indirectly saying how often a
 function is executed.

 Optimizing is a rather complicated process, and particular optimizations
 very often have weird and unpredictable interactions with other
 optimizations.

 For example, in the olden days, C compilers had a 'register' storage
 class. Optimizers' register allocation strategy was so primitive it
 needed help. Over time, however, it became apparent that uses of
 'register' became bit-rotted due to maintenance, resulting in all the
 wrong variables being enregistered. Compiler register allocation got a
 lot better, almost always being better than the users'. Not only that,
 but with generic code, and optimization rewrites of code, many variables
 would disappear and new ones would take their place. Different CPUs
 needed different register allocation strategies. What to do with
 'register' then?

 The result was compilers began to take the 'register' as a hint, and
 eventually moved to totally ignoring 'register', as it turned out to be
 a pessimization.

 I suspect that elevating one particular optimization hint to being an
 absolute command may not turn out well. Inlining already has performance
 issues, as it may increase the size of an inner loop beyond what will
 fit in the cache, for just one unexpected result. For another it may
 mess up the register allocation of the caller. "Inlining makes it
 faster" is not always true. Do you really want to weld this in as an
 absolute requirement in the language?

I'll add an anecdote - in HHVM we owe a lot of speedups to the careful 
use of "never inline" and "always inline" gcc pragmas IN ADDITION TO the 
usual "inline" directives. We have factual proof that gcc makes the 
wrong inline decisions BOTH WAYS if left to decide.

If we define pragmas for inlining, "always inline" must mean always 
inline no questions asked and "never inline" must mean always prevent 
inlining no questions asked. Anything else would be a frustrating waste 
of time.
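
For readers who have not used them, here is a rough sketch of what per-function "always inline" and "never inline" requests look like in C with GCC or Clang. The attributes always_inline and noinline are real compiler extensions; the macro names and the functions rotl32, handle_zero and mix are hypothetical and not taken from HHVM. As far as I know, GCC reports an error when it cannot inline a call to an always_inline function rather than silently skipping it.

```c
#include <stdint.h>

/* Assumption: GCC or Clang. These attributes are extensions, not ISO C. */
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#define NEVER_INLINE  __attribute__((noinline))

/* Hypothetical hot helper: a 32-bit rotate-left we want expanded at
   every call site. */
ALWAYS_INLINE uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Hypothetical rare path: kept out of line so it does not bloat callers. */
NEVER_INLINE uint32_t handle_zero(void)
{
    return 0xdeadbeefu;
}

/* Caller mixing both kinds of request. */
uint32_t mix(uint32_t x)
{
    if (x == 0)
        return handle_zero();
    return rotl32(x, 5) ^ x;
}
```

Whether either request actually helps is something only measurement can tell, which is the point of the anecdote above.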


Andrei

Feb 23 2014

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Monday, 24 February 2014 at 04:14:08 UTC, Andrei Alexandrescu 
wrote:
 I'll add an anecdote - in HHVM we owe a lot of speedups to the 
 careful use of "never inline" and "always inline" gcc pragmas 
 IN ADDITION TO the usual "inline" directives. We have factual 
 proof that gcc makes the wrong inline decisions BOTH WAYS if 
 left to decide.

 If we define pragmas for inlining, "always inline" must mean 
 always inline no questions asked and "never inline" must mean 
 always prevent inlining no questions asked. Anything else would 
 be a frustrating waste of time.

I think there is another, distinct use case for an inline pragma 
where "try to inline" is useful - namely, turning on the 
equivalent of the compiler "-inline" switch for just one 
function. I believe this is the original rationale behind the DIP 
(enabling inlining for certain functions even in debug builds, 
because otherwise the debug builds become so slow as to be 
unusable). In this case, whether the compiler actually succeeds 
at inlining the function doesn't matter as long as it does the 
same thing as for an optimized (-inline) build.

Thus, I think there should be "try to inline" (same as -inline) 
and "always inline" (failure stops compilation).

Feb 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 8:26 PM, Vladimir Panteleev wrote:
 On Monday, 24 February 2014 at 04:14:08 UTC, Andrei Alexandrescu wrote:
 I'll add an anecdote - in HHVM we owe a lot of speedups to the careful
 use of "never inline" and "always inline" gcc pragmas IN ADDITION TO
 the usual "inline" directives. We have factual proof that gcc makes
 the wrong inline decisions BOTH WAYS if left to decide.

 If we define pragmas for inlining, "always inline" must mean always
 inline no questions asked and "never inline" must mean always prevent
 inlining no questions asked. Anything else would be a frustrating
 waste of time.

 I think there is another, distinct use case for an inline pragma where
 "try to inline" is useful - namely, turning on the equivalent of the
 compiler "-inline" switch for just one function. I believe this is the
 original rationale behind the DIP (enabling inlining for certain
 functions even in debug builds, because otherwise the debug builds
 become so slow as to be unusable). In this case, whether the compiler
 actually succeeds at inlining the function doesn't matter as long as it
 does the same thing as for an optimized (-inline) build.

 Thus, I think there should be "try to inline" (same as -inline) and
 "always inline" (failure stops compilation).

Sounds fair enough.

Andrei

Feb 23 2014

Jerry <jlquinn optonline.net> writes:

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

 On 2/23/14, 8:26 PM, Vladimir Panteleev wrote:
 Thus, I think there should be "try to inline" (same as -inline) and
 "always inline" (failure stops compilation).

 Sounds fair enough.

pragma(inline, false);
pragma(inline, true);
pragma(inline, force);  // inline or die

How is that?

Feb 24 2014

"francesco cattoglio" <francesco.cattoglio gmail.com> writes:

On Monday, 24 February 2014 at 22:09:49 UTC, Jerry wrote:
 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

 On 2/23/14, 8:26 PM, Vladimir Panteleev wrote:
 Thus, I think there should be "try to inline" (same as 
 -inline) and
 "always inline" (failure stops compilation).

 Sounds fair enough.

 pragma(inline, false);
 pragma(inline, true);
 pragma(inline, force);  // inline or die

 How is that?

Personally I like it. Perhaps you forgot
pragma(inline, never);  // don't inline or die
but I honestly have no idea if this would actually be useful.

Anyway, I'm really fine if there will be no way to force inline. 
But if we can't guarantee that inlining actually happens, please 
change the pragma name.

Feb 25 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

24-Feb-2014 04:33, Walter Bright wrote:
 On 2/23/2014 3:55 PM, Mike wrote:
 The difference is it was explicitly told to do something and didn't.
 That's insubordination.

 I view this as more in the manner of providing the equivalent of runtime
 profiling information to the optimizer, in indirectly saying how often a
 function is executed.

 Optimizing is a rather complicated process, and particular optimizations
 very often have weird and unpredictable interactions with other
 optimizations.

Speaking of other optimizations.

There is a thing called tail-call. Funnily enough compilers still 
consider it an optimization whereas in practice the difference usually 
means "stack overflow" vs "normal execution" for functional-style code. 
But I'd rather prefer we stay focused on one particular optimization here.

 For example, in the olden days, C compilers had a 'register' storage
 class. Optimizers' register allocation strategy was so primitive it
 needed help. Over time, however, it became apparent that uses of
 'register' became bit-rotted due to maintenance, resulting in all the
 wrong variables being enregistered. Compiler register allocation got a
 lot better, almost always being better than the users'.

When the time comes that the compiler can actually produce the best inlining 
decisions on its own, these kinds of options may become irrelevant.
However, it may need to run a profiler on relevant input to work that out 
all by itself.

 Not only that,
 but with generic code, and optimization rewrites of code, many variables
 would disappear and new ones would take their place. Different CPUs
 needed different register allocation strategies. What to do with
 'register' then?

Indeed register was tied to something immaterial - a variable, whereas 
in fact there are plenty of temporaries and induction variables that a 
programmer can't label.

In contrast, generic code is functions upon functions passed through 
other tiny functions. This is in part what makes inlining so special.

 The result was compilers began to take the 'register' as a hint, and
 eventually moved to totally ignoring 'register', as it turned out to be
 a pessimization.

 I suspect that elevating one particular optimization hint to being an
 absolute command may not turn out well. Inlining already has performance
 issues, as it may increase the size of an inner loop beyond what will
 fit in the cache, for just one unexpected result. For another it may
 mess up the register allocation of the caller. "Inlining makes it
 faster" is not always true.

Like I'm a bloody idiot. But once your performance problem is (after 
perusing ASM) a particular function not being inlined, dancing around the 
compiler in the DARK until it strikes home (if ever) isn't a viable option.

And with DMD, in something like 90% of cases my problem is some critical 
one-liner not being inlined. In contrast, register allocation is mostly 
fine.
There are some marvelous codegen gems though:
https://d.puremagic.com/issues/show_bug.cgi?id=10932
where the compiler moves from ebx to edx via a stack slot for no apparent 
reason.

 Do you really want to weld this in as an
 absolute requirement in the language?

Aye. That and explicit tail calls but that's a separate matter.
Experimental compilers may choose to issue warnings saying that they 
basically can't inline (yet or by design).

-- 
Dmitry Olshansky

Feb 24 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

24-Feb-2014 03:49, Walter Bright wrote:
 On 2/23/2014 3:00 PM, Dmitry Olshansky wrote:
 You are actually going against yourself with this argument - for porting you
 typically suggest:

 version(OS1)
   ...
 else version(OS2)
   ...
 else
 static assert(0);

 There's not much choice about that. I also suggest moving such code into
 separate modules.


 Your nice tried and true way of doing things is EQUALLY FRAGILE (if
 not more) and highly coupled to the compiler, but only SILENTLY so.

 That's very true. Do you suggest the compiler emit a list of what
 optimizations it did or did not do? What makes inlining special, as
 opposed to, say, enregistering particular variables?

GCC has these attributes (including flatten, to inline all calls inside 
a function) for a good reason. Let's face the fact that compilers are 
nowhere near perfect in their decisions about inlining. Especially so when 
building libraries.
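
As a small illustration of the flatten attribute mentioned above, here is a sketch in C assuming GCC or Clang. The function names are hypothetical. As I understand the GCC documentation, flatten asks for every call inside the marked function to be inlined "if possible", so it is a strong request rather than a guarantee.

```c
/* Assumption: GCC or Clang; flatten is an extension, not ISO C. */

/* Hypothetical small helpers. */
static int square(int x) { return x * x; }
static int add(int a, int b) { return a + b; }

/* flatten asks the compiler to inline the calls made inside this
   function's body, where it can. */
__attribute__((flatten)) int sum_of_squares(int a, int b)
{
    return add(square(a), square(b));
}
```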

Inlining is special in the sense that the compiler doesn't know (there is 
not a single hint today in D) whether any particular function should be a 
part of the object code (following the ABI and referenced elsewhere) or just 
a logical piece of code that is reused (using any convenient calling 
convention or inlined).

Let me turn the question sideways - what if no_inline were a hint to the 
compiler, and it could feel free to inline the function anyway? Would you be 
happy with such a pragma? It's that simple - you either gain control, or 
stay with wishy-washy hopes.

As you said, the contrast is with register allocation (a ridiculously 
hard problem), where over time it turned out that trying to outsmart 
the compiler is something people were not good at in general.

-- 
Dmitry Olshansky

Feb 24 2014

"Tove" <tove fransson.se> writes:

On Sunday, 23 February 2014 at 21:53:43 UTC, Walter Bright wrote:
 I'm aware of that, but once you add the:

     version(BadCompiler) { } else pragma(inline, true);

 things will never get better for BadCompiler. And besides, that 
 line looks awful.

If I need to support multiple compilers and one of them is not 
good enough, I would first try to figure out which statement 
causes it to fail; if left with no other alternatives, I would manually 
inline it in the common path for all compilers, _not_ create 
version blocks.

Inspecting asm output doesn't scale well to huge projects. 
Imagine simply updating the existing codebase to use a new 
compiler version.

Based on my experience, even if we are profiling and benchmarking 
a lot and have many performance-based KPIs, they will still 
never be as fine-grained as the functional test coverage.

Also, not to forget, some performance issues may only be detected 
in live usage scenarios on the other side of the earth, as the 
developers don't even have access to the needed 
environment (only imperfect simulations). In those scenarios you 
are quite grateful for every static compilation error/warning you 
can get...

You are right in that there is nothing special about inlining, 
but I'd rather add warnings for all other failed optimisation 
opportunities than not to warn about failed inlining. RVCT for 
instance has --diag_warning=optimizations, which gives many 
helpful hints, such as alias issues: please add "restrict", or 
possible alignment issues etc.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:21 PM, Tove wrote:
 Inspecting asm output doesn't scale well to huge projects. Imagine simply
 updating the existing codebase to use a new compiler version.

Again, this is treating 'inline' as being the only optimization that matters? 
It's not even the most important - that would likely be register allocation.

At some point, you're going to need to trust the compiler.


 You are right in that there is nothing special about inlining, but I'd rather
 add warnings for all other failed optimisation opportunities than not to warn
 about failed inlining. RVCT for instance has --diag_warning=optimizations,
 which gives many helpful hints, such as alias issues: please add "restrict", or
 possible alignment issues etc.

There are *thousands* of optimization patterns. Logging which ones were applied 
to each expression node would be utterly useless to anyone but a compiler 
writer. (You can turn this on in debug builds of the compiler and see for 
yourself.)

The most effective log is to look at the asm output. There isn't a substitute. I 
know that doesn't scale, going back to my point that at some point you're going 
to have to spot check here and there and otherwise trust the compiler.

I know that most programmers don't want to look at the asm output. Whether an 
error for failed inlining is or is not issued won't change the need to have a 
look now and then, if you want your code to be the fastest it can be.

BTW, although the DIP says the compiler can ignore it, in practice there aren't 
going to be perverse compilers. Compiler writers want their compilers to be 
useful, and don't go out of their way to sneakily interpret the spec to do as 
bad a job as possible. Conversely, the history of programmer-supplied optimizer 
edicts (see 'register') is not a very good one, as programmers are often not 
terribly cognizant of the tradeoffs and tend to use overly simplistic rules when 
applying these edicts. As optimizers improve, they shouldn't be impeded by 
well-intentioned but wrong optimization edicts.

(An early version of my C compiler had a long list of various optimization 
strategies that could be turned on/off. Never once was any appropriate use made 
of these. It's why dmd has evolved to simply have -O. -inline is a separate 
switch for reasons of symbolic debuggability.)

Feb 23 2014

Brad Roberts <braddr puremagic.com> writes:

On 2/23/14, 5:05 PM, Walter Bright wrote:
 On 2/23/2014 4:21 PM, Tove wrote:
 Inspecting asm output doesn't scale well to huge projects. Imagine simply
 updating the existing codebase to use a new compiler version.

 Again, this is treating 'inline' as being the only optimization that
 matters? It's not even the most important - that would likely be
 register allocation.

 At some point, you're going to need to trust the compiler.

At this point, you're starting to argue that the entire DIP isn't 
relevant.  I agree with the majority that if you're going to have the 
directive, then it needs to be enforcement, not suggestion.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:45 PM, Brad Roberts wrote:
 At this point, you're starting to argue that the entire DIP isn't relevant.  I
 agree with the majority that if you're going to have the directive, then it
 needs to be enforcement, not suggestion.

1. It provides information to the compiler about runtime frequency that it 
cannot obtain otherwise. This is very useful information for generating better 
code.

2. Making it a hard requirement then means the user will have to put versioning 
in it. It becomes inherently non-portable. There is no way to predict what some 
other version of some other compiler on some other system will do.

3. In the end, the compiler should make the decision. Inlining does not always 
result in faster code, as I pointed out in another post.

4. I don't see that users really are asking for inlining or not. They are asking 
for the fastest code. As such, providing hints about usage frequencies are 
entirely appropriate. Micromanaging the method used is not so appropriate. After 
all, the reason one uses a compiler in the first place rather than assembler is 
to not micromanage the actual instructions.


Perhaps the lesson is the word 'inline' carries certain expectations with it, 
and the feature would be better positioned as something like:

     pragma(usage, often);
     pragma(usage, rare);

Feb 23 2014

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 23 Feb 2014 21:05:32 -0500, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 2/23/2014 5:45 PM, Brad Roberts wrote:
 At this point, you're starting to argue that the entire DIP isn't  
 relevant.  I
 agree with the majority that if you're going to have the directive,  
 then it
 needs to be enforcement, not suggestion.

 1. It provides information to the compiler about runtime frequency that  
 it cannot obtain otherwise. This is very useful information for  
 generating better code.

But you are under-utilizing the message. There is the case that one wants  
inlining, even when -inline isn't passed to the compiler, for functions  
that would have been inlined if -inline was specified. That is your case,  
right?

But there is a case where the compiler for some reason has decided that  
inlining a function is not worth it, so even with -inline it doesn't do  
it. However, without the inlining, the function becomes horrendously slow.  
For example, functions that contain lazy parameters.

 2. Making it a hard requirement then means the user will have to put  
 versioning in it. It becomes inherently non-portable. There is no way to  
 predict what some other version of some other compiler on some other  
 system will do.

This is not a problem. The whole point is, if the compiler doesn't support  
the inlining, the code is useless. I WANT it to fail, there is no reason  
to version it out.

 3. In the end, the compiler should make the decision. Inlining does not  
 always result in faster code, as I pointed out in another post.

Huh? Then why even have the feature if the compiler is going to ignore  
your request!

This feature sounds completely useless to me, it certainly adds no real  
value that warrants adding a pragma. It may as well be called  
pragma(please_inline_pretty_pretty_please_ill_be_your_best_friend)

 4. I don't see that users really are asking for inlining or not. They  
 are asking for the fastest code. As such, providing hints about usage  
 frequencies are entirely appropriate. Micromanaging the method used is  
 not so appropriate. After all, the reason one uses a compiler in the  
 first place rather than assembler is to not micromanage the actual  
 instructions.

Compilers are not infallible. They may make mistakes, or not have enough  
information, which is the point of this feature. What is to say they don't  
make mistakes even with the correct amount of information?

And the reason I use a compiler rather than assembler is because I hate  
writing assembler :)

 Perhaps the lesson is the word 'inline' carries certain expectations  
 with it, and the feature would be better positioned as something like:

      pragma(usage, often);
      pragma(usage, rare);

This is totally the wrong tack. First, I may have no idea how often a  
function will be used. Second, usage frequency has nothing to do with how  
inlining may affect the performance of an individual call. If an inlined  
function always executes faster than calling the function, I always want  
to inline.

For example, foo:

void foo(ref int x)
{
   ++x;
}

-Steve

Feb 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 6:05 PM, Walter Bright wrote:
 4. I don't see that users really are asking for inlining or not. They
 are asking for the fastest code. As such, providing hints about usage
 frequencies are entirely appropriate. Micromanaging the method used is
 not so appropriate. After all, the reason one uses a compiler in the
 first place rather than assembler is to not micromanage the actual
 instructions.

In HHVM we plainly ask for specific decisions on inlining or not. We 
have a reasonably good understanding of how and where our code has 
trouble with ICache misses, and adjust our inline decisions and validate 
using experiments.

A decision to force inlining or against it already indicates a failure 
of the compiler's heuristics to address the situation. Keeping it an 
option is insisting on failing.

 Perhaps the lesson is the word 'inline' carries certain expectations
 with it, and the feature would be better positioned as something like:

      pragma(usage, often);
      pragma(usage, rare);

That's an interesting unrelated idea. But if we define pragmas to 
"force inline" and "never inline", we must make damn sure the 
compiler always does that. It's "listen to your customers" as plainly as 
it gets.


Andrei

Feb 23 2014

"francesco cattoglio" <francesco.cattoglio gmail.com> writes:

On Monday, 24 February 2014 at 02:05:31 UTC, Walter Bright wrote:
 1. It provides information to the compiler about runtime 
 frequency that it cannot obtain otherwise. This is very useful 
 information for generating better code.

This answers your own previous question: this is what makes 
"inline" a special optimization.

 3. In the end, the compiler should make the decision. Inlining 
 does not always result in faster code, as I pointed out in 
 another post.

Honestly, in the little profiling I've done in my life, 
inlining at least never made my code slower. But I do realize this is not 
relevant to the discussion.

 Perhaps the lesson is the word 'inline' carries certain 
 expectations with it, and the feature would be better 
 positioned as something like:

     pragma(usage, often);
     pragma(usage, rare);

Yes, I think "inline" carries huge expectations: the expectation 
for the compiler to comply. If the plan is hinting frequency 
information, then "usage" makes way more sense. It might be used 
in if blocks and in switch cases too, when branch prediction 
might be sloppy or suboptimal.

Feb 23 2014

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Feb 24, 2014 2:10 AM, "Walter Bright" <newshound2 digitalmars.com> wrote:
 On 2/23/2014 5:45 PM, Brad Roberts wrote:
 At this point, you're starting to argue that the entire DIP isn't relevant.  I
 agree with the majority that if you're going to have the directive, then it
 needs to be enforcement, not suggestion.

 1. It provides information to the compiler about runtime frequency that it
 cannot obtain otherwise. This is very useful information for generating
 better code.

 2. Making it a hard requirement then means the user will have to put
 versioning in it. It becomes inherently non-portable. There is no way to
 predict what some other version of some other compiler on some other system
 will do.

 3. In the end, the compiler should make the decision. Inlining does not
 always result in faster code, as I pointed out in another post.

 4. I don't see that users really are asking for inlining or not. They are
 asking for the fastest code. As such, providing hints about usage
 frequencies are entirely appropriate. Micromanaging the method used is not
 so appropriate. After all, the reason one uses a compiler in the first
 place rather than assembler is to not micromanage the actual instructions.

 Perhaps the lesson is the word 'inline' carries certain expectations with
 it, and the feature would be better positioned as something like:

     pragma(usage, often);
     pragma(usage, rare);

Also known as, hot and cold functions.

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Feb 23 2014

"Dicebot" <public dicebot.lv> writes:

On Monday, 24 February 2014 at 02:05:31 UTC, Walter Bright wrote:
     pragma(usage, often);
     pragma(usage, rare);

This is also a useful feature, especially when also applicable to 
if branches  (I have been using __builtin_expect quite a lot with 
GCC). But it is different, I think we need both.

Feb 24 2014

"ponce" <contact gam3sfrommars.fr> writes:

On Monday, 24 February 2014 at 02:05:31 UTC, Walter Bright wrote:
 1. It provides information to the compiler about runtime 
 frequency that it cannot obtain otherwise. This is very useful 
 information for generating better code.

 2. Making it a hard requirement then means the user will have 
 to put versioning in it. It becomes inherently non-portable. 
 There is no way to predict what some other version of some 
 other compiler on some other system will do.

I'm not sure why it would be impossible to inline in some cases; I've 
never hit that limitation with ICC.
Like others I would like unconditional and explicit optimization 
from the compiler.

 3. In the end, the compiler should make the decision. Inlining 
 does not always result in faster code, as I pointed out in 
 another post.

Also when I use "force inline" it's very often to force 
"not-inline" to reuse the same bit of code while the compiler 
would have inlined it.

Each optimization here is put through a repeatable automated A-B test 
with a 95% statistical significance on various inputs, and 
forcing inline/not-inline has been an effective tool to reduce 
the I-cache stress that plagues some very particular program 
areas that the compiler doesn't differentiate. This can be 
checked by looking at assembly or binary size afterwards.

I'm perfectly OK with the compiler doing what it wants when I 
don't tell it to inline or not. AFAIK the C/C++ inline keyword is 
mostly ignored by optimizing compilers, it's precisely a keyword 
that is both overused and meaningless.


 Perhaps the lesson is the word 'inline' carries certain 
 expectations with it, and the feature would be better 
 positioned as something like:

     pragma(usage, often);
     pragma(usage, rare);

To me it's not so much about usage frequency as about I-cache 
misses. Some inlining can be nearly free (I-cache working set 
small), or very costly (I-cache actively being the bottleneck 
through repeated miss due to large working set).

Feb 24 2014

Manu <turkeyman gmail.com> writes:

On 24 February 2014 07:53, Walter Bright <newshound2 digitalmars.com> wrote:

  With error - you get a huge advantage - an _instant_ feedback that it
 doesn't do
 what you want it to do. Otherwise it gets the extra pleasure of running
 disassembler to pinpoint your favorite call sites or observing that your
 profiler shows the same awful stats.

 My point is you're going to have to look at the asm of the top functions
 on the profiler stats anyway, or you're wasting your time trying to
 optimize the code. (Speaking from considerable experience doing that.)
 There's a heluva lot more to optimizing effectively than inlining, and it
 takes some back-and-forth tweaking source code and looking at the
 assembler. I gave some examples of that above.

For those interested, in my experience, the value of inlining is rarely
related to eliminating the cost of the function call. call and ret have
virtually no impact on performance on any architecture I've used.
The main value is that it eliminates stuffing around with parameter lists,
and managing save registers. Also, some argument types can't pass in
registers, which means they pass through memory, and memory access should
be treated no differently from the hard drive in realtime code ;) .. The
worst case is a write followed by an immediate read (non-register argument,
or save register value); some architectures stall waiting for the full
flush before they can read it back. It's called a Load-Hit-Store hazard,
and it's the most expensive low level hazard short of an L2 miss.
But the most important use by far is that you can control which functions
are leaf functions. Leaf functions (functions that don't allocate a stack
frame at all) are critical for good performance. Any small helper functions
you call MUST be inlined, or your function is no longer eligible to be a
leaf function.

I agree that inline should be a hint (a STRONG hint, not like 'inline' in
C, more like __force_inline, perhaps stronger), but I'd like it if I
received a warning when it failed for whatever reason. I don't want it to
stop compiling, but a nice notification that I should look into it, and the
ability to disable/silence the warning if I can't/don't intend to.

Feb 24 2014

"Kapps" <opantm2+spam gmail.com> writes:

On Monday, 24 February 2014 at 16:58:21 UTC, Manu wrote:
 I agree that inline should be a hint (a STRONG hint, not like 
 'inline' in
 C, more like __force_inline, perhaps stronger), but I'd like it 
 if I
 received a warning when it failed for whatever reason. I don't 
 want it to
 stop compiling, but a nice notification that I should look into 
 it, and the
 ability to disable/silence the warning if I can't/don't intend 
 to.

Perhaps something like a -vinline similar to -vtls? You don't
need to be spammed repeatedly every time you build saying
something isn't inlined, yet this still gives an easy way of
seeing which methods you requested to be inlined that were not.
The flag would display only functions marked with pragma(inline,
true).

Feb 24 2014

"Dicebot" <public dicebot.lv> writes:

On Monday, 24 February 2014 at 18:00:39 UTC, Kapps wrote:
 Perhaps something like a -vinline similar to -vtls? You don't
 need to be spammed repeatedly every time you build saying
 something isn't inlined, yet this still gives an easy way of
 seeing which methods you requested to be inlined that were not.
 The flag would display only functions marked with pragma(inline,
 true).

As I have already mentioned in this thread, there is already a 
pull request that adds a flag to print inlining diagnostics. It 
can be re-used once merged.

Feb 24 2014

"deadalnix" <deadalnix gmail.com> writes:

On Monday, 24 February 2014 at 16:58:21 UTC, Manu wrote:
 For those interested, in my experience, the value of inlining 
 is rarely
 related to eliminating the cost of the function call. call and 
 ret have
 virtually no impact on performance on any architecture I've 
 used.

It highly depends on the architecture you run on. X86 is
astonishingly good at this.

 The main value is that it eliminates stuffing around with 
 parameter lists,
 and managing save registers. Also, some argument types can't 
 pass in
 registers, which means they pass through memory, and memory 
 access should
 be treated no differently from the hard drive in realtime code 
 ;) .. The
 worst case is a write followed by an immediate read 
 (non-register argument,
 or save register value); some architectures stall waiting for 
 the full
 flush before they can read it back. It's called a 
 Load-Hit-Store hazard,
 and it's the most expensive low level hazard short of an L2 
 miss.

All modern architectures that I know of (if I put aside PIC) have
a store buffer to avoid this.

Also, not inlining prevents the compiler from doing constant
propagation and, as such, prevents it from doing a lot
of optimizations.

 I agree that inline should be a hint (a STRONG hint, not like 
 'inline' in
 C, more like __force_inline, perhaps stronger), but I'd like it 
 if I
 received a warning when it failed for whatever reason. I don't 
 want it to
 stop compiling, but a nice notification that I should look into 
 it, and the
 ability to disable/silence the warning if I can't/don't intend 
 to.

Proposed semantics:
Inline unless for some reason you cannot. If you cannot, warn
about it.

Feb 24 2014

"Tove" <tove fransson.se> writes:

On Sunday, 23 February 2014 at 12:57:00 UTC, Walter Bright wrote:
 On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.

 I suspect that may cause problems, because different compilers 
 will have different inlining capabilities. I think it should be 
 a 'recommendation' to the compiler.

Would assert be feasible or difficult to implement with the 
current compiler design?

static assert(pragma(inline, true));

Feb 23 2014

Iain Buclaw <ibuclaw gdcproject.org> writes:

On 23 February 2014 14:19, Tove <tove fransson.se> wrote:
 On Sunday, 23 February 2014 at 12:57:00 UTC, Walter Bright wrote:
 On 2/23/2014 4:25 AM, Tove wrote:
 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.


 I suspect that may cause problems, because different compilers will have
 different inlining capabilities. I think it should be a 'recommendation' to
 the compiler.


 Would assert be feasible or difficult to implement with the current compiler
 design?

 static assert(pragma(inline, true));

WAT!

Feb 23 2014

Manu <turkeyman gmail.com> writes:

On 23 February 2014 22:57, Walter Bright <newshound2 digitalmars.com> wrote:

 On 2/23/2014 4:25 AM, Tove wrote:

 The DIP should probably specify what happens if inlining fails,
 i.e. generate a compilation error.

 I suspect that may cause problems, because different compilers will have
 different inlining capabilities. I think it should be a 'recommendation' to
 the compiler.

Does this depend on how it is implemented?
Will DMD just patch it directly into the AST like a mixin in the front end,
or is it always left to the back end?

Feb 24 2014

"Andrej Mitrovic" <andrej.mitrovich gmail.com> writes:

On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never 
 inlining. This DIP proposes a simple solution.

What if you want to mark a series of functions to be inlined? 
E.g. in an entire module:

-----
module fast;

// ??
pragma(inline, true):

Vec vecSum();
Vec vecMul();
-----

Seems like a solution would be preferred where this can be used 
for multiple functions. A UDA/@property of some sort.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:31 AM, Andrej Mitrovic wrote:
 What if you want to mark a series of functions to be inlined? E.g. in an entire
 module:

 -----
 module fast;

 // ??
 pragma(inline, true):

 Vec vecSum();
 Vec vecMul();
 -----

That can work because pragmas can have blocks associated with them.
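
A minimal sketch of the block form mentioned above, assuming the pragma is accepted on a braced declaration block the way other D attributes are. The functions `add`, `mul` and `dot` are hypothetical scalar stand-ins for the `vecSum` and `vecMul` in the question, and how strictly the request is honoured is left to the compiler:

```d
// Every function declared inside the braces receives the inlining request.
pragma(inline, true)
{
    int add(int a, int b) { return a + b; }
    int mul(int a, int b) { return a * b; }
}

// Declared outside the block, so the compiler's own heuristics apply.
int dot(int ax, int ay, int bx, int by)
{
    return add(mul(ax, bx), mul(ay, by));
}

unittest
{
    assert(add(2, 3) == 5);
    assert(mul(2, 3) == 6);
    assert(dot(1, 2, 3, 4) == 11);
}
```

Closing the brace ends the pragma's scope, so no separate "return to normal" directive is needed for this form.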

Feb 23 2014

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

23-Feb-2014 16:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

Why pragma? Also how exactly it is supposed to work:

pragma(inline, true);
... //every declaration that follows is forcibly inlined?
pragma(inline, false);
... //every declaration that follows is forcibly NOT inlined?

How to return to normal state then? I think pragma is not attached to 
declaration.

I'd strongly favor introducing a compiler-hint family of UDAs and 
force_inline/force_notinline as first among many.

-- 
Dmitry Olshansky

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:38 AM, Dmitry Olshansky wrote:
 Why pragma?

Answered in another post.


 Also how exactly it is supposed to work:

T func(args)
{
     ...
     pragma(inline, true);
     ...
}


 How to return to normal state then?

Not necessary when it's inside a function.


 I'd strongly favor introducing a compiler-hint family of UDAs and
 force_inline/force_notinline as first among many.

I don't see an advantage of that over pragma. It also seems like something that 
should be inside a function, not outside. (After all, a function with no body 
cannot be inlined.)

Feb 23 2014

"Joseph Cassman" <jc7919 outlook.com> writes:

On Sunday, 23 February 2014 at 12:50:58 UTC, Walter Bright wrote:
 On 2/23/2014 4:38 AM, Dmitry Olshansky wrote:
 Why pragma?

 Answered in another post.


 Also how exactly it is supposed to work:

 T func(args)
 {
     ...
     pragma(inline, true);
     ...
 }


 How to return to normal state then?

 Not necessary when it's inside a function.


 I'd strongly favor introducing a compiler-hint family of UDAs 
 and
 force_inline/force_notinline as first among many.

 I don't see an advantage of that over pragma. It also seems 
 like something that should be inside a function, not outside. 
 (After all, a function with no body cannot be inlined.)

Thanks for the code example. That helped me better understand 
what is being proposed.

I like the idea of using pragma since it is built specifically 
for the purpose of sending information to the compiler from code. 
Also, I like not having to add another keyword to a function 
definition. Especially since I already have "@safe pure nothrow" 
in as many places as possible, for inline-able functions I'd 
prefer to not have to add "inline" to that list. Using a pragma 
would mean it could be implemented right away without worrying 
about breaking any existing code. The proposal also satisfies the 
needs of both parties. Especially since D is a flexible language 
it would be nice to give such ability to customize code 
generation to the programmer.

Given the above I think this is a good idea.

Joseph

Feb 23 2014

dennis luehring <dl.soluz gmx.net> writes:

On 23.02.2014 13:38, Dmitry Olshansky wrote:
 23-Feb-2014 16:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 Why pragma? Also how exactly it is supposed to work:

 pragma(inline, true);
 ... //every declaration that follows is forcibly inlined?
 pragma(inline, false);
 ... //every declaration that follows is forcibly NOT inlined?

 How to return to normal state then? I think pragma is not attached to
 declaration.

 I'd strongly favor introducing a compiler-hint family of UDAs and
 force_inline/force_notinline as first among many.

yeah, it feels strange - like naked in inline asm
it's a scope changer - that sits inside the scope it changes???

like writing public methods by putting public inside of the method - and 
public is also compiler relevant for the generated interface

and align is also not a pragma - and still changes code generation

it's a function-(compile-)attribute but that does not mean it has to
be a pragma

btw: is the pragma way just easier to implement - or else i don't 
understand why this is handled so specially?

Feb 23 2014

"ponce" <contact gam3sfrommars.fr> writes:

On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never 
 inlining. This DIP proposes a simple solution.

This is great. I bet this will be useful.

I tend to prefer force-inline/force-not-inline at call site, but 
realized the proposal will let me do it:

void myFun(bool inlined)(int arg)
{
     static if (inlined)
         pragma(inline, true);
     else
         pragma(inline, false);
}

Then inlining can be entirely explicit :)
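
A sketch of how that call-site choice might look in use, assuming the pragma may appear inside `static if` branches of a function body as written above. The function `scale` and its body are hypothetical; only the pragma placement comes from the post:

```d
// Hypothetical helper: the template argument selects the inlining request.
int scale(bool inlined)(int arg)
{
    static if (inlined)
        pragma(inline, true);   // ask for this instantiation to be inlined
    else
        pragma(inline, false);  // ask for this instantiation to stay a call
    return arg * 2;
}

unittest
{
    // The caller decides per call site by picking the instantiation.
    assert(scale!true(21) == 42);
    assert(scale!false(21) == 42);
}
```

Each instantiation is a separate function, so `scale!true` and `scale!false` can carry different requests without affecting each other.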

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 4:53 AM, ponce wrote:
 On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This DIP
 proposes a simple solution.

 This is great. I bet this will be useful.

 I tend to prefer force-inline/force-not-inline at call site, but realized the
 proposal will let me do it:

 void myFun(bool inlined)(int arg)
 {
      static if (inlined)
          pragma(inline, true);
      else
          pragma(inline, false);
 }

 Then inlining can be entirely explicit :)

Or better:

void myFun(bool inlined)(int arg)
{
     pragma(inline, inlined);
}

:-)

Feb 23 2014

Manu <turkeyman gmail.com> writes:

On 23 February 2014 22:55, Walter Bright <newshound2 digitalmars.com> wrote:

 On 2/23/2014 4:53 AM, ponce wrote:

 On Sunday, 23 February 2014 at 12:07:40 UTC, Walter Bright wrote:

 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This DIP
 proposes a simple solution.

 This is great. I bet this will be useful.

 I tend to prefer force-inline/force-not-inline at call site, but realized
 the
 proposal will let me do it:

 void myFun(bool inlined)(int arg)
 {
      static if (inlined)
          pragma(inline, true);
      else
          pragma(inline, false);
 }

 Then inlining can be entirely explicit :)

 Or better:

 void myFun(bool inlined)(int arg)
 {
     pragma(inline, inlined);
 }

 :-)

Really? I think you're just trying to be different for the sake of being
different :P

Feb 24 2014

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On 23/02/2014 13:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This DIP
 proposes a simple solution.

Sounds good in principle.  So, if I understand right, a pragma(inline, true) 
anywhere inside a function adds a compiler hint to always inline this function, 
while with false it's a hint to _never_ do so, and no pragma at all gives the 
usual compiler-decides situation?

Question: what happens if someone is daft enough to put both true and false 
inside the same function?

In any case, could you possibly provide a slightly more detailed code example 
with accompanying explanation of what the intended results are?

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:06 AM, Joseph Rushton Wakeling wrote:
 So, if I understand right, a pragma(inline, true)
 anywhere inside a function adds a compiler hint to always inline this function,
 while with false it's a hint to _never_ do so, and no pragma at all gives the
 usual compiler-decides situation?

I'll add:

     pragma(inline);

meaning revert to default behavior.


 Question: what happens if someone is daft enough to put both true and false
 inside the same function?

The last one wins.
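
For reference, the three states discussed here could be written as below. This is only a sketch under the semantics described in this post (true, false, and the argument-less form that reverts to the default); the function names are hypothetical:

```d
int alwaysInlined(int x)
{
    pragma(inline, true);   // request inlining at call sites
    return x + 1;
}

int neverInlined(int x)
{
    pragma(inline, false);  // request that this stays an ordinary call
    return x + 1;
}

int compilerDecides(int x)
{
    pragma(inline);         // revert to the compiler's default heuristics
    return x + 1;
}

unittest
{
    // The pragma only affects code generation, never the result.
    assert(alwaysInlined(1) == 2);
    assert(neverInlined(1) == 2);
    assert(compilerDecides(1) == 2);
}
```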

Feb 23 2014

"Andrej Mitrovic" <andrej.mitrovich gmail.com> writes:

On Sunday, 23 February 2014 at 20:29:19 UTC, Walter Bright wrote:
 I'll add:

     pragma(inline);

That's just going to confuse people, because they'll think *this* 
forces inlining.

I'd prefer 3 separate states. pragma(inline), pragma(no_inline), 
and pragma(default_inline) or something like that.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 12:31 PM, Andrej Mitrovic wrote:
 On Sunday, 23 February 2014 at 20:29:19 UTC, Walter Bright wrote:
 I'll add:

     pragma(inline);

 That's just going to confuse people, because they'll think *this* forces
 inlining.

Perhaps, but there's precedent with how align works, and how default 
initialization of variables works.


 I'd prefer 3 separate states. pragma(inline), pragma(no_inline), and
 pragma(default_inline) or something like that.

That makes documentation with a sorted list of pragmas impractical.

Feb 23 2014

"tn" <no email.com> writes:

On Sunday, 23 February 2014 at 21:38:46 UTC, Walter Bright wrote:
 On 2/23/2014 12:31 PM, Andrej Mitrovic wrote:
 I'd prefer 3 separate states. pragma(inline), 
 pragma(no_inline), and
 pragma(default_inline) or something like that.

 That makes documentation with a sorted list of pragmas 
 impractical.

pragma(inline_always)
pragma(inline_never)
pragma(inline_default)

or

pragma(inline_force)
pragma(inline_prevent)
pragma(inline_default)

Feb 23 2014

Xavier Bigand <flamaros.xavier gmail.com> writes:

On 23/02/2014 13:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

I have often seen C++ developers, working on applications that don't need 
this level of optimization, add the inline keyword or put implementations 
in header files without doing any performance analysis!!! And from what I 
saw, they don't know x86 either...
For those who don't have the necessary knowledge it's just counterproductive, 
and it increases compilation times with no evidence of any benefit.

So my point is that this kind of feature has to be hidden from newbies (like 
me) and other overzealous developers.

Feb 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 4:07 AM, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

This makes inlining dependent on previously-seen code. Would that make 
parallel compilation more difficult?

I've always thought the obvious/simple way would be an attribute such as 
@forceinline and @noinline that applies to individual functions.


Andrei

Feb 23 2014

"Meta" <jared771 gmail.com> writes:

On Monday, 24 February 2014 at 01:12:56 UTC, Andrei Alexandrescu 
wrote:
 This makes inlining dependent on previously-seen code. Would 
 that make parallel compilation more difficult?

 I've always thought the obvious/simple way would be an 
 attribute such as @forceinline and @noinline that applies to 
 individual functions.


 Andrei

That seems to be how Rust does it, but I'm not really clear how 
attributes work in Rust.

http://static.rust-lang.org/doc/master/rust.html#inline-attributes

Feb 23 2014

"bearophile" <bearophileHUGS lycos.com> writes:

Andrei Alexandrescu:

 I've always thought the obvious/simple way would be an 
 attribute such as @forceinline and @noinline that applies to 
 individual functions.

Seems good. And what do you think the D compiler should do when 
you use @forceinline and it can't inline?

Bye,
bearophile

Feb 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 5:55 PM, bearophile wrote:
 Andrei Alexandrescu:

 I've always thought the obvious/simple way would be an attribute such
 as @forceinline and @noinline that applies to individual functions.

 Seems good. And what do you think the D compiler should do when you use
 @forceinline and it can't inline?

Compile-time error, no two ways about it.

Andrei

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 5:12 PM, Andrei Alexandrescu wrote:
 This makes inlining dependent on previously-seen code. Would that make parallel
 compilation more difficult?

I don't understand the question. Inlining always depends on the compiler having 
seen the function body.

 I've always thought the obvious/simple way would be an attribute such as
 @forceinline and @noinline that applies to individual functions.

Since inlining can't be done without the function body, putting the pragma in 
the function body makes sense.

Feb 23 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/23/14, 6:12 PM, Walter Bright wrote:
 On 2/23/2014 5:12 PM, Andrei Alexandrescu wrote:
 This makes inlining dependent on previously-seen code. Would that make
 parallel
 compilation more difficult?

 I don't understand the question. Inlining always depends on the compiler
 having seen the function body.

Decision to inline at line 2000 may be caused by a pragma in line 2.

Andrei

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 8:18 PM, Andrei Alexandrescu wrote:
 On 2/23/14, 6:12 PM, Walter Bright wrote:
 On 2/23/2014 5:12 PM, Andrei Alexandrescu wrote:
 This makes inlining dependent on previously-seen code. Would that make
 parallel
 compilation more difficult?

 I don't understand the question. Inlining always depends on the compiler
 having seen the function body.

 Decision to inline at line 2000 may be caused by a pragma in line 2.

I still don't understand the question. Successfully compiling anything in D can 
have dependencies on arbitrary other parts of the code. Why would inlining be 
any different, or be a special problem?

Feb 24 2014

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 2/24/14, 12:55 AM, Walter Bright wrote:
 On 2/23/2014 8:18 PM, Andrei Alexandrescu wrote:
 On 2/23/14, 6:12 PM, Walter Bright wrote:
 On 2/23/2014 5:12 PM, Andrei Alexandrescu wrote:
 This makes inlining dependent on previously-seen code. Would that make
 parallel
 compilation more difficult?

 I don't understand the question. Inlining always depends on the compiler
 having seen the function body.

 Decision to inline at line 2000 may be caused by a pragma in line 2.

 I still don't understand the question. Successfully compiling anything
 in D can have dependencies on arbitrary other parts of the code. Why
 would inlining be any different, or be a special problem?

Probably it makes no difference, sorry for the distraction.

Andrei

Feb 24 2014

Iain Buclaw <ibuclaw gdcproject.org> writes:

On Feb 24, 2014 1:15 AM, "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote:
 On 2/23/14, 4:07 AM, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 This makes inlining dependent on previously-seen code. Would that make
 parallel compilation more difficult?

 I've always thought the obvious/simple way would be an attribute such as
 @forceinline and @noinline that applies to individual functions.

 Andrei

GDC already has both of these as a compiler extended attribute (need to
document these!!!)

import gcc.attribute;

@attribute("forceinline") ...

Being backend attributes, you can't enforce that these attributes actually
take effect in user code (no static asserts!) - but you have some guarantee
in that the backend will complain if it can't apply the attribute - this is
good because the compiler will always produce a better diagnostic than some
user static assert, always.

Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Feb 23 2014

Lionello Lunesu <lionello lunesu.remove.com> writes:

On 23/02/14 20:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

void A()
{
}

void B()
{
   pragma(inline, true) A();
}

void C()
{
   B();
}

Reading that code, I would guess that within B(), the call to A() would 
get inlined. Reading the DIP, it appears that the pragma controls 
whether B() gets inlined.

When the pragma is used outside of the scope at the function declaration 
it would work more like "inline" or "__inline" in C++, correct?

L.

Feb 23 2014

Walter Bright <newshound2 digitalmars.com> writes:

On 2/23/2014 6:12 PM, Lionello Lunesu wrote:
 On 23/02/14 20:07, Walter Bright wrote:
 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This
 DIP proposes a simple solution.

 void A()
 {
 }

 void B()
 {
    pragma(inline, true) A();

No. This would be:
      pragma(inline, true);
      A();
and then B() will be inlined when it is encountered.

 }

 void C()
 {
    B();
 }

 Reading that code, I would guess that within B(), the call to A() would get
 inlined. Reading the DIP, it appears that the pragma controls whether B() gets
 inlined.

 When the pragma is used outside of the scope at the function declaration it
 would work more like "inline" or "__inline" in C++, correct?

Yes.

Feb 24 2014

Manu <turkeyman gmail.com> writes:

This will probably do, but I still don't understand why not a function
attribute?

Will marking a function as inline notify the compiler that code should
never be emitted to object files for that function?

Perhaps OT:
I've been playing with ranges a lot recently, and std.algorithm and
friends, and I'm finding that using lambdas is a real problem. They don't
reliably inline, and the optimiser seems to have problems on occasion even
when they do. (Perhaps they inline at the wrong stage?)
How can we have some guarantees about the inlining and inline-ability of
trivial lambdas?
I'm very concerned about the performance of debug code when using something
like filter!"condition", which results in a whole bunch of extra function
calls per loop iteration.
I raised a thread recently about the idea of adding an additional optional
argument to foreach to provide a filtering or termination condition, which
if implemented by the language would have no overhead cost. The suggestion
was to use filter!"", which sounds like a reasonable idea, but I'm really
worried about the performance implications of using library primitives that
produce a bunch of extra function calls on every loop cycle. I'm not sure
these are practical when used in sufficiently trivial loops. Imagine I'm
looping over a vertex array or an image or something, skipping over
transparent pixels, or something like that... millions of iterations
performing very trivial transformation, calling a bunch of functions every
cycle.

On 23 February 2014 22:07, Walter Bright <newshound2 digitalmars.com> wrote:

 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This DIP
 proposes a simple solution.

Feb 24 2014

Manu <turkeyman gmail.com> writes:

On 25 February 2014 02:30, Manu <turkeyman gmail.com> wrote:

 This will probably do, but I still don't understand why not a function
 attribute?

Note; GDC and LDC already have inline attributes. It's a pain in the arse
to use them though, since in D, we have no way to alias attributes and
can't do the typical C preprocessor tricks to insert appropriate attributes
for different compilers.
I'd strongly encourage considering making it an attribute for the reason
that all compilers could then share the same attribute, rather than
remaining fragmented as it is.


Will marking a function as inline notify the compiler that code should
 never be emitted to object files for that function?

 Perhaps OT:
 I've been playing with ranges a lot recently, and std.algorithm and
 friends, and I'm finding that using lambdas is real problem. They don't
 reliably inline, and the optimiser seems to have problems on occasion even
 when they do. (Perhaps they inline at the wrong stage?)
 How can we have some guarantees about the inlining and inline-ability of
 trivial lambda's?
 I'm very concerned about the performance of debug code when using
 something like filter!"condition", which results in a whole bunch of extra
 function calls per loop iteration.
 I raised a thread recently about the idea of adding an additional optional
 argument to foreach to provide a filtering or termination condition, which
 if implemented by the language would have no overhead cost. The suggestion
 was to use filter!"", which sounds like a reasonable idea, but I'm really
 worried about the performance implications of using library primitives that
 produce a bunch of extra function calls on every loop cycle. I'm not sure
 these are practical when used in sufficiently trivial loops. Imagine I'm
 looping over a vertex array or an image or something, skipping over
 transparent pixels, or something like that... millions of iterations
 performing very trivial transformation, calling a bunch of functions every
 cycle.

 On 23 February 2014 22:07, Walter Bright <newshound2 digitalmars.com>wrote:

 http://wiki.dlang.org/DIP56

 Manu has needed always inlining, and I've needed never inlining. This DIP
 proposes a simple solution.

Feb 24 2014
