
digitalmars.D - Optimisation possibilities: current, future and enhancements

reply Cecil Ward <d cecilward.com> writes:
I'm wondering if there are more opportunities in D for increased 
optimization in compilers that have not been mined yet. I'm 
specifically interested in the possibilities of D over and above 
what is possible in C and C++ because of the characteristics of D 
or because of our freedom to change compared with the inertia in 
the C/C++ world.

A useful phrase I saw today: “declaration of intent given by the 
programmer to the compiler”.

As well as the opportunities that exist in D as it stands and as 
it is _used_ currently,
I wonder what could be achieved by enhancing the language or its 
usage patterns
with new semantic restrictive markings that could be implemented 
with varying degrees of disruptiveness (from zero, up to idiom 
bans or even minor grammar changes [gulp!]). New attributes or 
property markings have already been added, I believe, during the 
history of D, which fall into this category. I'm also thinking of 
things like pragmas, functions with magic names and so forth.

Examples from the wider world, for discussion, no guarantees they 
allow increased optimisation:

* In C, the “restrict” marker
* In C++, the mechanism that makes possible optimised 
move-constructors and a solution to temporary-object hell
* assert()’s possibilities: but only if it is native and not 
deleted too early in the compiler stack - guarantees of the truth 
of predicates and restriction of values to be in known ranges, 
just as compilers can exploit prior truth of if-statements. Same 
for static assert of course
* Contracts, invariants, and pre-/postconditions: many, many 
delicious possibilities (a small sketch follows this list). Ideally 
they need to be published at two extra levels: within the same 
module, and globally, so that callers even from other translation 
units which have only the prototype can have a richer spec to guide 
inlining and truly first-class optimisation
* Custom calling conventions
* Some C compilers have magic to allow the declaration of an ISR. 
Stretching the category of optimisation a bit perhaps, but 
certainly does aid optimisation in the widest sense, because you 
don't then have to have unnecessary extra function-calls just to 
bolt assembler to a routine in C
* Similarly, inter-language calling mechanisms in general
* GCC and LDC’s extended asm interfacing specs, constraints and 
other magic
* Non-return-function marking, first in GCC and then in C itself. 
(iirc. And in C++?)
* the GCC "__builtin_expect()" / "likely()" and "unlikely()" 
magic marker functions to aid branch-prediction, code layout, etc
* GCC’s “builtin_assume_aligned()” function
* The GCC -ffast-math switch allows (if I understand correctly) 
the compiler to assume there are no IEEE floating-point 
weirdnesses such as infinities, NaNs, denormals anywhere, or to 
assume that the programmer just doesn't care. What if there were 
a mechanism to give kind of control down to a much more 
fine-grained level? (Such as 
per-function/operator/block/expression/object/variable.)
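
To make the contracts item concrete, here is a minimal sketch using 
D's existing `in`/`out` blocks; `decode` is a made-up function, and 
whether a compiler exploits these guarantees in callers that only see 
the prototype is exactly the wish above, not current behaviour:

uint decode(const(ubyte)[] buf)
in
{
    assert(buf.length >= 3);         // caller promises at least 3 bytes
}
out(result)
{
    assert(result <= 0x00FF_FFFF);   // callee promises the result's range
}
body
{
    return (buf[0] << 16) | (buf[1] << 8) | buf[2];
}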

Sorry it's such a paltry list. However discussion will I'm sure 
expand it.
Aug 25 2016
next sibling parent reply Cauterite <cauterite gmail.com> writes:
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:

I long for the day we ditch signalling NaNs — they would surely 
prevent `-ffast-math` from being effective.

I have a couple more ideas, here's one of them:
- if a function is pure and called with constexpr parameters, the 
compiler could potentially execute that call in the CTFE engine 
(automatically), as part of the constant-folding phase I guess. 
Such a technique will hopefully one day be practical, once the 
CTFE engine's performance improves.
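
For comparison, what can already be forced by hand today: 
initialising an `enum` from a call makes the compiler evaluate that 
call in the CTFE engine, which is roughly what the automatic version 
would give for constant arguments (`fib` below is just a stand-in 
example):

import std.stdio : writeln;

ulong fib(uint n) pure nothrow
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

enum fib20 = fib(20);   // evaluated at compile time by the CTFE engine

void main()
{
    writeln(fib20);     // 6765; no run-time call to fib happens here
}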
Aug 25 2016
parent Marco Leise <Marco.Leise gmx.de> writes:
On Thu, 25 Aug 2016 11:45:40 +0000,
Cauterite <cauterite gmail.com> wrote:

 - if a function is pure and called with constexpr parameters, the 
 compiler could potentially execute that call in the CTFE engine 
 (automatically), as part of the constant-folding phase I guess. 
 Such a technique will hopefully one day be practical, once the 
 CTFE engine's performance improves.
Just a note on where compiler technology stands right now: 
Const-folding after inlining will have this effect for small pure 
functions. I've also seen GCC duplicate functions to remove one of 
the arguments with a constant if it figured it could optimize the 
function around that argument. This is effectively the same as 
having a 2nd version of the function that takes the run-time 
argument as a compile-time argument.

-- 
Marco
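
Roughly the same effect can be written out by hand in D by turning 
the constant argument into a template parameter, so that each 
constant of interest gets its own specialised instance. A minimal 
sketch (the names scaleRT/scaleCT are made up for illustration):

int scaleRT(int x, int factor) pure nothrow
{
    return x * factor;          // factor is only known at run time
}

int scaleCT(int factor)(int x) pure nothrow
{
    return x * factor;          // factor is baked into this instantiation
}

void main()
{
    import std.stdio : writeln;
    writeln(scaleRT(21, 2));    // ordinary run-time argument
    writeln(scaleCT!2(21));     // specialised copy, multiply-by-2 folded in
}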
Aug 29 2016
prev sibling next sibling parent reply Cauterite <cauterite gmail.com> writes:
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 * the GCC "__builtin_expect()"
Instead of adding individual micro-optimisation features like this, 
I'd be more interested in the potential for profile-guided 
optimisation (which *ideally* can make these micro-optimisation 
decisions automatically). Since DMD already has some framework in 
place to support code profiling, I suspect this is at least a 
feasible enhancement.

On the other hand, it might not be worth trying to play catch-up 
with existing PGO features in GCC/LLVM. If you're using PGO, you're 
probably already using these other backends for their more advanced 
static optimisers.
Aug 25 2016
parent reply Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 11:55:08 UTC, Cauterite wrote:
 On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 * the GCC "__builtin_expect()"
 Instead of adding individual micro-optimisation features like 
 this, I'd be more interested in the potential for profile-guided 
 optimisation (which *ideally* can make these micro-optimisation 
 decisions automatically). Since DMD already has some framework in 
 place to support code profiling, I suspect this is at least a 
 feasible enhancement.

 On the other hand, it might not be worth trying to play catch-up 
 with existing PGO features in GCC/LLVM. If you're using PGO, 
 you're probably already using these other backends for their more 
 advanced static optimisers.
One killer reason for me to use D rather than C or C++ would be if 
it either has or could be enhanced to have greater code optimisation 
possibilities. LTO isn't relevant here because it's equally 
applicable to other languages (in GCC at any rate, as I understand 
it).

Aside from its own properties, D might have an advantage over C 
because its spec development could possibly be more agile, well, 
compared with C _standards_ anyway. GCC has kept innovating with 
non-standard features, to its credit. I think it's desirable for D 
not to fall _behind_ GCC's non-standard powerful and ingenious 
tricks.
Aug 25 2016
parent Cauterite <cauterite gmail.com> writes:
On Thursday, 25 August 2016 at 12:27:20 UTC, Cecil Ward wrote:

When I said GCC/LLVM I meant GDC(GNU D Compiler)/LDC(LLVM D 
Compiler). I might have caused some confusion there.
Aug 25 2016
prev sibling next sibling parent Andrea Fontana <nospam example.com> writes:
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 * Non-return-function marking, first in GCC and then in C 
 itself. (iirc. And in C++?)
Maybe it could be implemented as

int blah()
out(result) { assert(0); }
body { }

instead of marking the function itself.
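
A compilable sketch of that suggestion, with an invented function 
name and body; the out-contract asserting 0 documents that a normal 
return can never happen:

int fatalError(string msg)
out(result)
{
    assert(0);  // the postcondition can never hold: no normal return
}
body
{
    throw new Exception(msg);  // the only way out is via an exception
}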
Aug 25 2016
prev sibling next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 I'm wondering if there are more opportunities in D for 
 increased optimization in compilers that have not been mined 
 yet. I'm specifically interested in the possibilities of D over 
 and above what is possible in C and C++ because of the 
 characteristics of D or because of our freedom to change 
 compared with the inertia in the C/C++ world.

 [...]
 Sorry it's such a paltry list. However discussion will I'm sure 
 expand it.
I'll add

* create temporaries based on the const function attribute.

I don't know why but I believed that it was already the case. After 
disassembling a short test with DMD and LDMD2 it appears clearly 
that this is not true:

°°°°°°°°°°°°°°°°°°°°°°°°°°
struct Foo
{
    immutable _u = 8;
    int foo() const
    {
        return 8 * _u;
    }
}

int use(ref const(Foo) foo)
{
    return foo.foo() + foo.foo();
}
°°°°°°°°°°°°°°°°°°°°°°°°°°

disasm of use (LDC2 via LDMD2, -O -release):

0000000000402930h  sub rsp, 18h
0000000000402934h  mov qword ptr [rsp+10h], rdi
0000000000402939h  call 00000000004028F0h    ; (Foo.foo)
000000000040293Eh  mov rdi, qword ptr [rsp+10h]
0000000000402943h  mov dword ptr [rsp+0Ch], eax
0000000000402947h  call 00000000004028F0h    ; (Foo.foo)
000000000040294Ch  mov ecx, dword ptr [rsp+0Ch]
0000000000402950h  add ecx, eax
0000000000402952h  mov eax, ecx
0000000000402954h  add rsp, 18h
0000000000402958h  ret

But Foo.foo constness guarantees that Foo state is not modified. So 
the result of the first CALL could be cached in a temporary and 
reused instead of the second CALL.

This would help for example in loops when a getter function is 
called to know the iteration count.
Aug 25 2016
next sibling parent reply kinke <noone nowhere.com> writes:
I found it hard to believe LDC generates such crappy code when 
optimizing. These are my results using LDC master on Win64 (`ldc2 
-O -release -output-s`):

struct Foo
{
     immutable _u = 8;
     int foo() const
     {
         return 8 * _u;
     }
}
int use(ref const(Foo) foo)
{
     return foo.foo() + foo.foo();
}

int main()
{
     Foo f;
     return use(f);
}


_D7current3Foo3fooMxFZi:
	movl	(%rcx), %eax
	shll	$3, %eax
	retq

_D7current3useFKxS7current3FooZi:
	movl	(%rcx), %eax
	shll	$4, %eax
	retq

_Dmain:
	movl	$128, %eax
	retq

Sure, Foo.foo() and use() could return a constant, but otherwise 
it can't get much better than this.
Aug 25 2016
next sibling parent reply Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
 I found it hard to believe LDC generates such crappy code when 
 optimizing. These are my results using LDC master on Win64 
 (`ldc2 -O -release -output-s`):

 struct Foo
 {
     immutable _u = 8;
     int foo() const
     {
         return 8 * _u;
     }
 }
 int use(ref const(Foo) foo)
 {
     return foo.foo() + foo.foo();
 }

 int main()
 {
     Foo f;
     return use(f);
 }


 _D7current3Foo3fooMxFZi:
 	movl	(%rcx), %eax
 	shll	$3, %eax
 	retq

 _D7current3useFKxS7current3FooZi:
 	movl	(%rcx), %eax
 	shll	$4, %eax
 	retq

 _Dmain:
 	movl	$128, %eax
 	retq

 Sure, Foo.foo() and use() could return a constant, but 
 otherwise it can't get much better than this.
I think that here the optimisation is only because LDC can “see” the text of the method. When expansion is not possible, that would be the real test.
Aug 25 2016
parent reply Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 18:07:14 UTC, Cecil Ward wrote:
 On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
 [...]
I think that here the optimisation is only because LDC can “see” the text of the method. When expansion is not possible, that would be the real test.
(Assuming LDC behaves like GDC. I'm unfamiliar with LDC, I'm ashamed to admit.)
Aug 25 2016
parent kinke <noone nowhere.com> writes:
On Thursday, 25 August 2016 at 18:09:14 UTC, Cecil Ward wrote:
 On Thursday, 25 August 2016 at 18:07:14 UTC, Cecil Ward wrote:
 On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
 [...]
I think that here the optimisation is only because LDC can “see” the text of the method. When expansion is not possible, that would be the real test.
(Assuming LDC behaves like GDC. I'm unfamiliar with LDC, I'm ashamed to admit.)
You're right. The question is whether it pays off to optimize 
heavily for externals.

If you build all modules of a binary at once via `ldmd2 m1.d m2.d 
...` or via `ldc2 -singleobj m1.d m2.d ...`, LDC emits all the code 
into a single LLVM module, which can then be optimized very 
aggressively. Call graphs inside the binary are thus taken care of, 
so if it's a well-encapsulated library with few (or expensive) calls 
to externals, it doesn't matter much.

druntime and Phobos are treated as externals. But Johan Engelen 
already pointed out that LDC could ship with them as LLVM bitcode 
libraries and then link them in before machine code generation...
Aug 25 2016
prev sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
 I found it hard to believe LDC generates such crappy code when
Yes that's right, there was an error in my script! What I've posted is actually the asm without -O.
 Sure, Foo.foo() and use() could return a constant, but 
 otherwise it can't get much better than this.
The problem here is that the example is a bad fit for such 
aggressive optimizations, because the CALLs are eliminated despite 
there being no inlining. Here's better code to illustrate the idea:

°°°°°°°°°°°°°°°°°°°°°°
interface Foo
{
    int foo() const;
}

int use(const(Foo) foo)
{
    return foo.foo() + foo.foo();
}
°°°°°°°°°°°°°°°°°°°°°°

And I'd expect this asm for a 'const optimization' in use():

push rbp
push rbx
push rax
mov rbx, rdi
mov rax, qword ptr [rbx]
call qword ptr [rax+08h]
add eax, eax // const funct = no side effect, so add the 1st result = save a CALL
add rsp, 08h
pop rbx
pop rbp
ret
Aug 25 2016
next sibling parent ag0aep6g <anonymous example.com> writes:
On 08/25/2016 08:15 PM, Basile B. wrote:
 Here's a better code to illustrate the idea:

 °°°°°°°°°°°°°°°°°°°°°°
 interface Foo
 {
     int foo() const;
 }

 int use(const(Foo) foo)
 {
     return foo.foo() + foo.foo();
 }
 °°°°°°°°°°°°°°°°°°°°°°


 And I'd expect this asm for a 'const optimization' in use():

 push rbp
 push rbx
 push rax
 mov rbx, rdi
 mov rax, qword ptr [rbx]
 call qword ptr [rax+08h]
 add eax, eax // const funct = no side effect, so add the 1st result =
 save a CALL
 add rsp, 08h
 pop rbx
 pop rbp
 ret
At least, foo needs to be `pure` for that.
Aug 25 2016
prev sibling parent reply kinke <noone nowhere.com> writes:
On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
 The problem here that the example is bad with too agressive 
 optimizations because the CALLs are eliminated despite of no 
 inlining.

 [...]

 int use(const(Foo) foo)
 {
     return foo.foo() + foo.foo();
 }
From my perspective, the problem with this example isn't missed 
optimization potential. It's the code itself. Why waste 
implementation effort on such optimizations if that would only 
reward people writing such ugly code with performance equal to the 
more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster 
with optimizations turned off and c) IMO simply clearer.
Aug 25 2016
parent reply Basile B. <b2.temp gmx.com> writes:
On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
 On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
 From my perspective, the problem with this example isn't missed 
 optimization potential. It's the code itself. Why waste 
 implementation efforts for such optimizations, if that would 
 only reward people writing such ugly code with an equal 
 performance to a more sane `2 * foo.foo()`? The latter is a) 
 shorter, b) also faster with optimizations turned off and c) 
 IMO simply clearer.
You're too focused on the example itself (we could use a 
non-trivial example, but then the generated asm would be longer). 
The point you miss is that it just *illustrates* what should happen 
when many calls to a pure const function occur in a single 
subprogram.
Aug 25 2016
parent reply kink <noone nowhere.com> writes:
On Friday, 26 August 2016 at 05:50:52 UTC, Basile B. wrote:
 On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
 On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
 From my perspective, the problem with this example isn't 
 missed optimization potential. It's the code itself. Why waste 
 implementation efforts for such optimizations, if that would 
 only reward people writing such ugly code with an equal 
 performance to a more sane `2 * foo.foo()`? The latter is a) 
 shorter, b) also faster with optimizations turned off and c) 
 IMO simply clearer.
You're too focused on the example itself (Let's find an non trivial example, but then the asm generated would be longer). The point you miss is that it just *illustrates* what should happend when many calls to a pure const function are occur in a single sub program.
I know that it's just an illustration. But I surely don't like any 
function with repeated calls to this pure function. Why not have the 
developer code in a sensible style (cache that result once for that 
whole 'subprogram' manually) if performance is a concern? A compiler 
penalizing such bad coding style is absolutely fine by me.
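
Spelled out, that manual caching looks like this (a minimal sketch 
reusing the hypothetical Foo interface from earlier in the thread, 
with pure added since, as noted, constness alone is not enough):

interface Foo
{
    int foo() const pure;
}

int use(const(Foo) foo)
{
    immutable n = foo.foo();   // call the pure const getter once
    return n + n;              // and reuse the cached result
}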
Aug 26 2016
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.08.2016 10:44, kink wrote:
 On Friday, 26 August 2016 at 05:50:52 UTC, Basile B. wrote:
 On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
 On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
 From my perspective, the problem with this example isn't missed
 optimization potential. It's the code itself. Why waste
 implementation efforts for such optimizations, if that would only
 reward people writing such ugly code with an equal performance to a
 more sane `2 * foo.foo()`? The latter is a) shorter, b) also faster
 with optimizations turned off and c) IMO simply clearer.
You're too focused on the example itself (Let's find an non trivial example, but then the asm generated would be longer). The point you miss is that it just *illustrates* what should happend when many calls to a pure const function are occur in a single sub program.
I know that it's just an illustration. But I surely don't like any function with repeated calls to this pure function. Why not have the developer code in a sensible style (cache that result once for that whole 'subprogram' manually) if performance is a concern?
Better performance is better even when it is not the primary concern.
 A compiler penalizing such bad coding style is absolutely fine by me.
It's not the compiler's business to judge coding style, also:

// original code. not "bad".

int foo(int x) pure{ ... }

int bar(int x) pure{ return foo(x) + foo(5-x); }

void main(){
    writeln(bar(5));
}

// ==> inlining

void main(){
    writeln(foo(5)+foo(10-5));
}

// ==> constant folding, "bad" code

void main(){
    writeln(foo(5)+foo(5));
}
Aug 26 2016
parent reply kink <noone nowhere.com> writes:
On Friday, 26 August 2016 at 09:30:52 UTC, Timon Gehr wrote:
 Better performance is better even when it is not the primary 
 concern.
 It's not the compiler's business to judge coding style
It's free to choose not to implement complex optimizations just so that people get super duper performance for crappy code, especially with the limited manpower we got. Fixing severe DMD bugs (there are still enough of those) is 100x more important.
 // original code. not "bad".

 int foo(int x) pure{ ... }

 int bar(int x) pure{ return foo(x) + foo(5-x); }

 void main(){
     writeln(bar(5));
 }

 // ==> inlining

 void main(){
     writeln(foo(5)+foo(10-5));
 }

 // ==> constant folding, "bad" code

 void main(){
     writeln(foo(5)+foo(5));
 }
Inlining and subsequent constant folding are only available if the 
callee isn't an external. For those cases, existing LLVM/GCC 
optimizations kick in and render at least this idea (automatic 
caching of pure function calls) obsolete (in most cases), see the 
LDC asm. This is my main point.

Higher-level, D-specific optimizations would be nice, but modern 
backends, coupled with optimized builds (`ldc2 -singleobj m1.d m2.d 
...`), eat some of the ideas here for breakfast. So I'd focus on 
cases where LDC/GDC don't exploit optimization potential already.
Aug 26 2016
parent Chris Wright <dhasenan gmail.com> writes:
On Fri, 26 Aug 2016 10:32:55 +0000, kink wrote:
 Inlining and subsequent constant folding are only available if the
 callee isn't an external.
The optimizations being discussed are a result of purity guarantees 
that have nothing to do with inlining. The example showed constant 
folding as a way to get code that the compiler could obviously 
optimize (assuming it took advantage of purity) without that fact 
necessarily being obvious to the programmer. These purity guarantees 
are available even if the compiler has no access to the source code 
of the pure function.

As a more concrete example, let's suppose I am dynamically linking 
to a compression library and compiling with a D interface file 
rather than full source. It has as one of its functions:

extern(D) size_t maxEncodedSize(size_t inputSize) pure;

I have a couple structs that happen to be the same size:

struct EntityID { ulong id; ulong family; }
struct GPSCoordinates { double latitude, longitude; }

I need to compress both of them and send them over the wire in one 
bundle, and I want to allocate a buffer in advance, so I write:

auto size = maxEncodedSize(EntityID.sizeof)
          + maxEncodedSize(GPSCoordinates.sizeof);
ubyte[] buf = new ubyte[size];

With pure, the compiler can rewrite that to:

ulong size = maxEncodedSize(16) << 1;
ubyte[] buf = new ubyte[size];

No inlining. No constant folding.
Aug 26 2016
prev sibling parent Basile B. <b2.temp gmx.com> writes:
On Friday, 26 August 2016 at 08:44:54 UTC, kink wrote:
 On Friday, 26 August 2016 at 05:50:52 UTC, Basile B. wrote:
 On Thursday, 25 August 2016 at 22:37:13 UTC, kinke wrote:
 On Thursday, 25 August 2016 at 18:15:47 UTC, Basile B. wrote:
 From my perspective, the problem with this example isn't 
 missed optimization potential. It's the code itself. Why 
 waste implementation efforts for such optimizations, if that 
 would only reward people writing such ugly code with an equal 
 performance to a more sane `2 * foo.foo()`? The latter is a) 
 shorter, b) also faster with optimizations turned off and c) 
 IMO simply clearer.
You're too focused on the example itself (Let's find an non trivial example, but then the asm generated would be longer). The point you miss is that it just *illustrates* what should happend when many calls to a pure const function are occur in a single sub program.
I know that it's just an illustration. But I surely don't like any function with repeated calls to this pure function. Why not have the developer code in a sensible style (cache that result once for that whole 'subprogram' manually) if performance is a concern? A compiler penalizing such bad coding style is absolutely fine by me.
To be honest, I would have expected another criticism against this 
optimization idea: the aggregate can expose shared methods that 
could, this time, modify the state, invalidating the automatic cache 
produced by the optimization for one thread, even if this scenario 
is not plausible without hacks (e.g. casting away the "shared" part 
of the type of a variable that's used in the const pure function).
Aug 26 2016
prev sibling next sibling parent Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 [...]
 I'll add

 * create temporaries based on the const function attribute.

 I don't know why but I believed that it was already the case. 
 After disassembling a short test with DMD and LDMD2 it appears 
 clearly that this is not true:

 °°°°°°°°°°°°°°°°°°°°°°°°°°
 struct Foo
 {
     immutable _u = 8;
     int foo() const
     {
         return 8 * _u;
     }
 }

 int use(ref const(Foo) foo)
 {
     return foo.foo() + foo.foo();
 }
 °°°°°°°°°°°°°°°°°°°°°°°°°°

 disasm of use (LDC2 via LDMD2, -O -release):

 0000000000402930h  sub rsp, 18h
 0000000000402934h  mov qword ptr [rsp+10h], rdi
 0000000000402939h  call 00000000004028F0h    ; (Foo.foo)
 000000000040293Eh  mov rdi, qword ptr [rsp+10h]
 0000000000402943h  mov dword ptr [rsp+0Ch], eax
 0000000000402947h  call 00000000004028F0h    ; (Foo.foo)
 000000000040294Ch  mov ecx, dword ptr [rsp+0Ch]
 0000000000402950h  add ecx, eax
 0000000000402952h  mov eax, ecx
 0000000000402954h  add rsp, 18h
 0000000000402958h  ret

 But Foo.foo constness guarantees that Foo state is not modified. 
 So the result of the first CALL could be cached in a temporary and 
 reused instead of the second CALL.

 This would help for example in loops when a getter function is 
 called to know the iteration count.
The problem of the non-caching of appropriate function calls is not 
confined to methods. It is also observable when calling explicitly 
pure-marked external functions: e.g. my_pure() + my_pure() makes two 
calls. (Checked in GCC -O3, with an extern pure-marked function.) 
This is often covered up by inlining with full expansion, as 
non-extern functions don't show this.
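
A reduced D rendering of that situation, where my_pure is only a 
prototype standing in for a function whose body lives in another 
translation unit:

int my_pure(int x) pure nothrow;   // declaration only; definition elsewhere

int caller(int x)
{
    // The question in this thread: with only the prototype visible,
    // is the compiler able (and allowed) to fold this into
    // 2 * my_pure(x)?
    return my_pure(x) + my_pure(x);
}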
Aug 25 2016
prev sibling parent reply Johan Engelen <j j.nl> writes:
On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
Struct method constness (as in your example) does not mean that the 
return value is constant when calling it twice in a row. As pointed 
out by others, the function needs to be pure. Dlang pureness is not 
a strong enough guarantee. For example, this is explicitly allowed 
by the spec:

```
pure int foo() {
    debug writeln("in foo()"); // ok, impure code allowed in debug statement
    return 1;
}
```

That makes it illegal to transform `foo()+foo()` to `a=foo(); a+a`, 
at least in debug builds.

David discusses your proposed optimization, and why it cannot be 
done in general (!) on Dlang pure functions:
http://klickverbot.at/blog/2012/05/purity-in-d/

-Johan
Aug 26 2016
parent reply Meta <jared771 gmail.com> writes:
On Friday, 26 August 2016 at 10:51:15 UTC, Johan Engelen wrote:
 On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
Struct method constness (as in your example) does not mean that the return value is constant when calling it twice in a row. As pointed out by others, the function needs to be pure. Dlang pureness is not a strong enough guarantee. For example, this is explicitly allowed by the spec: ``` pure int foo() { debug writeln("in foo()"); // ok, impure code allowed in debug statement return 1; } ``` That makes it illegal to transform `foo()+foo()` to `a=foo(); a+a`, at least in debug builds. David discusses your proposed optimization, and why it cannot be done in general (!) on Dlang pure functions. http://klickverbot.at/blog/2012/05/purity-in-d/ -Johan
Here's an example that doesn't even need to use debug statements, 
and is perfectly legal.

class Test
{
    int n;

    void setN(int val) pure
    {
        n = val;
    }

    int getN() const pure
    {
        return n;
    }
}

import std.stdio;

void main()
{
    auto t = new Test();
    writeln(t.getN()); //Prints 0
    t.setN(1);
    writeln(t.getN()); //Prints 1
}
Aug 26 2016
next sibling parent reply Basile B. <b2.temp gmx.com> writes:
On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
 On Friday, 26 August 2016 at 10:51:15 UTC, Johan Engelen wrote:
 On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
 Struct method constness (as in your example) does not mean that 
 the return value is constant when calling it twice in a row. As 
 pointed out by others, the function needs to be pure. Dlang 
 pureness is not a strong enough guarantee. For example, this is 
 explicitly allowed by the spec:

 ```
 pure int foo() {
     debug writeln("in foo()"); // ok, impure code allowed in debug statement
     return 1;
 }
 ```

 That makes it illegal to transform `foo()+foo()` to `a=foo(); 
 a+a`, at least in debug builds.

 David discusses your proposed optimization, and why it cannot be 
 done in general (!) on Dlang pure functions:
 http://klickverbot.at/blog/2012/05/purity-in-d/

 -Johan

 Here's an example that doesn't even need to use debug statements, 
 and is perfectly legal.

 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }

 import std.stdio;

 void main()
 {
     auto t = new Test();
     writeln(t.getN()); //Prints 0
     t.setN(1);
     writeln(t.getN()); //Prints 1
 }
That's another story. Of course, the optimization pass for this 
should check that **ALL** the calls to Test in a subprogram (or in 
this scope, if you want) are const... which is not the case here.
Aug 26 2016
next sibling parent reply Meta <jared771 gmail.com> writes:
On Friday, 26 August 2016 at 14:12:24 UTC, Basile B. wrote:
 That's another story. Of course the optimization pass for this 
 should check that **ALL** the calls to Test in a sub program 
 (or in this scope if you want) are const... Which is not the 
 case here.
My point is that getN is strongly pure but is not referentially transparent. Purity is broken in a strange way that you wouldn't expect, and you don't even have to use an escape hatch like debug.
Aug 26 2016
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 26 August 2016 at 17:52:36 UTC, Meta wrote:
 On Friday, 26 August 2016 at 14:12:24 UTC, Basile B. wrote:
 That's another story. Of course the optimization pass for this 
 should check that **ALL** the calls to Test in a sub program 
 (or in this scope if you want) are const... Which is not the 
 case here.
My point is that getN is strongly pure but is not referentially transparent. Purity is broken in a strange way that you wouldn't expect, and you don't even have to use an escape hatch like debug.
getN() is not pure at all: it refers to an object (in C terms) 
outside of its scope, one that does not depend on its parameters. It 
is irrelevant whether the outside object is a global variable or a 
member. Access to memory defined outside of its scope breaks 
pureness.
Aug 26 2016
prev sibling parent Basile B. <b2.temp gmx.com> writes:
On Friday, 26 August 2016 at 14:12:24 UTC, Basile B. wrote:
 On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
 On Friday, 26 August 2016 at 10:51:15 UTC, Johan Engelen wrote:
 On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
 Struct method constness (as in your example) does not mean that 
 the return value is constant when calling it twice in a row. As 
 pointed out by others, the function needs to be pure. Dlang 
 pureness is not a strong enough guarantee. For example, this is 
 explicitly allowed by the spec:

 ```
 pure int foo() {
     debug writeln("in foo()"); // ok, impure code allowed in debug statement
     return 1;
 }
 ```

 That makes it illegal to transform `foo()+foo()` to `a=foo(); 
 a+a`, at least in debug builds.

 David discusses your proposed optimization, and why it cannot be 
 done in general (!) on Dlang pure functions:
 http://klickverbot.at/blog/2012/05/purity-in-d/

 -Johan

 Here's an example that doesn't even need to use debug statements, 
 and is perfectly legal.

 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }

 import std.stdio;

 void main()
 {
     auto t = new Test();
     writeln(t.getN()); //Prints 0
     t.setN(1);
     writeln(t.getN()); //Prints 1
 }

 That's another story. Of course the optimization pass for this 
 should check that **ALL** the calls to Test in a sub program (or 
 in this scope if you want) are const... Which is not the case 
 here.
void foo(Bar bar)
{
    foreach(i; 0 .. bar.length)  // bar.length can be evaluated once
    {
        stuffA.a = bar[i].propertyA;  // bar[i] can be cached
        stuffB.b = bar[i].propertyB;  // to get propertyB
    }
}

with

interface Bar
{
    size_t length() pure const;
    Stuff opIndex(size_t) pure const;
}
Aug 26 2016
prev sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
 On Friday, 26 August 2016 at 10:51:15 UTC, Johan Engelen wrote:
 On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
 Struct method constness (as in your example) does not mean that 
 the return value is constant when calling it twice in a row. As 
 pointed out by others, the function needs to be pure. Dlang 
 pureness is not a strong enough guarantee. For example, this is 
 explicitly allowed by the spec:

 ```
 pure int foo() {
     debug writeln("in foo()"); // ok, impure code allowed in debug statement
     return 1;
 }
 ```

 That makes it illegal to transform `foo()+foo()` to `a=foo(); 
 a+a`, at least in debug builds.

 David discusses your proposed optimization, and why it cannot be 
 done in general (!) on Dlang pure functions:
 http://klickverbot.at/blog/2012/05/purity-in-d/

 -Johan

 Here's an example that doesn't even need to use debug statements, 
 and is perfectly legal.

 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }
getN() is not pure, simple as that (and an ideal compiler should 
complain in that case). A function is pure if it depends only on the 
state passed as parameters. If it touches memory that is set outside 
its scope, it is NOT PURE!!!
 import std.stdio;

 void main()
 {
     auto t = new Test();
     writeln(t.getN()); //Prints 0
     t.setN(1);
     writeln(t.getN()); //Prints 1
It has to, as getN() is not pure.
 }
In case you didn't get it, getN() is not pure. Marking it as such is 
a BUG!
Aug 26 2016
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 08/26/2016 09:51 PM, Patrick Schluter wrote:
 On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
[...]
 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }
getN() is not pure, simple as that (and an ideal compiler should complain in that case). A function is pure if it depends only of the state passed as parameter. If it touches memory that is set outside its scope it is NOT PURE!!!
You can rewrite that code like this:

----
class Test {/* ... as before but without getN ... */}

int getN(const Test test) pure { return test.n; }
----

Is getN pure now? It only touches memory via the parameter.

For methods we can think of `this` as a hidden parameter. If the 
method is tagged `const` or `immutable`, it's really that hidden 
parameter that is being qualified.
Aug 26 2016
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 26 August 2016 at 19:58:47 UTC, ag0aep6g wrote:
 On 08/26/2016 09:51 PM, Patrick Schluter wrote:
 On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
[...]
 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }
 getN() is not pure, simple as that (and an ideal compiler should 
 complain in that case). A function is pure if it depends only of 
 the state passed as parameter. If it touches memory that is set 
 outside its scope it is NOT PURE!!!

 You can rewrite that code like this:

 ----
 class Test {/* ... as before but without getN ... */}

 int getN(const Test test) pure { return test.n; }
 ----

 Is getN pure now? It only touches memory via the parameter.

 For methods we can think of `this` as a hidden parameter. If the 
 method is tagged `const` or `immutable`, it's really that hidden 
 parameter that is being qualified.
Yes. The optimisation of removing the second call is only possible 
if there is no access using the this pointer. A call to setN() (or 
to any member function using the mutable this pointer) will force 
the compiler to call getN() again.
Aug 26 2016
parent reply ag0aep6g <anonymous example.com> writes:
On 08/26/2016 10:09 PM, Patrick Schluter wrote:
 Yes. The optimisation of removing the second call is only possible if
 there is no access using the this pointer. The call to setN() (or any
 member function using the mutable this pointer), will mandate the
 compiler to call getN() again.
Whether setN is called or not does not affect getN's purity, though. 
You wrote that (the method) getN is not pure and should be rejected 
by the compiler, but both variants (method and function) are equally 
pure or impure.
Aug 26 2016
parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Friday, 26 August 2016 at 20:35:13 UTC, ag0aep6g wrote:
 On 08/26/2016 10:09 PM, Patrick Schluter wrote:
 Yes. The optimisation of removing the second call is only 
 possible if
 there is no access using the this pointer. The call to setN() 
 (or any
 member function using the mutable this pointer), will mandate 
 the
 compiler to call getN() again.
If setN is called or not does not affect getN's purity, though. You wrote that (the method) getN is not pure and should be rejected by the compiler, but both variants (method and function) are equally pure or impure.
Yeah, I was wrong, the function getN() is pure (but not const).

The thing is that the compiler can remove a second call to a pure 
function only if it can make sure that there are no memory changes 
via one of the pointers of the function. To give an example coming 
from C (sorry, I'm nearly only a C man): strlen(p) is pure. A second 
call to strlen(p) can only be removed if the compiler can guarantee 
that no change was made to the memory pointed to by p. Unfortunately 
that does not happen really that often. Even a simple call to 
strcpy(whatever, p) will break the optimisation (char * is always 
potentially aliased).

So again, sorry for posting too quickly; I shouldn't post after 
drinking Belgian beer ;-)
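
The same point transposed to D, with a made-up count0 standing in 
for strlen; the second call can only be dropped if the compiler can 
prove that nothing wrote to that memory in between:

size_t count0(const(char)[] s) pure nothrow
{
    size_t n;
    while (n < s.length && s[n] != 0)
        ++n;
    return n;
}

size_t twice(char[] buf)
{
    auto a = count0(buf);
    buf[0] = 0;             // a write through an alias of the same memory,
    auto b = count0(buf);   // so this call must not reuse the first result
    return a + b;
}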
Aug 26 2016
parent ag0aep6g <anonymous example.com> writes:
On 08/26/2016 10:48 PM, Patrick Schluter wrote:
 the function getN() is pure (but not const).
How is it not const? It doesn't alter the object. setN is the non-const one.
Aug 26 2016
prev sibling parent Meta <jared771 gmail.com> writes:
On Friday, 26 August 2016 at 19:51:02 UTC, Patrick Schluter wrote:
 On Friday, 26 August 2016 at 14:03:13 UTC, Meta wrote:
 On Friday, 26 August 2016 at 10:51:15 UTC, Johan Engelen wrote:
 On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:
 
 I'll add

 * create temporaries based on the const function attribute.
 Struct method constness (as in your example) does not mean that 
 the return value is constant when calling it twice in a row. As 
 pointed out by others, the function needs to be pure. Dlang 
 pureness is not a strong enough guarantee. For example, this is 
 explicitly allowed by the spec:

 ```
 pure int foo() {
     debug writeln("in foo()"); // ok, impure code allowed in debug statement
     return 1;
 }
 ```

 That makes it illegal to transform `foo()+foo()` to `a=foo(); 
 a+a`, at least in debug builds.

 David discusses your proposed optimization, and why it cannot be 
 done in general (!) on Dlang pure functions:
 http://klickverbot.at/blog/2012/05/purity-in-d/

 -Johan

 Here's an example that doesn't even need to use debug statements, 
 and is perfectly legal.

 class Test
 {
     int n;

     void setN(int val) pure
     {
         n = val;
     }

     int getN() const pure
     {
         return n;
     }
 }

 getN() is not pure, simple as that (and an ideal compiler should 
 complain in that case). A function is pure if it depends only of 
 the state passed as parameter. If it touches memory that is set 
 outside its scope it is NOT PURE!!!
 import std.stdio;

 void main()
 {
     auto t = new Test();
     writeln(t.getN()); //Prints 0
     t.setN(1);
     writeln(t.getN()); //Prints 1
It has to, as getN() is not pure.
 }
In case you didn't get it, getN() is not pure. Marking it as such is a BUG!.
Not according to the D spec.
Aug 26 2016
prev sibling parent reply Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
...
 A useful phrase I saw today: “declaration of intent given by 
 the programmer to the compiler”.
Particular dream wish-list items of mine: some kind of mechanism 
that could express possible operator properties, class properties 
and arithmetic identities, identity operations. Examples:

* commutativity;
* 1.0 / (1.0 / x) ≡ x, with or without ignoring of zero and ignoring 
of IEEE weirdos; or
* sin²(x) + cos²(x) ≡ 1
* special values of objects such as zero and one, so that 
(x ⊛ zero) ≡ x, and that (zero ⊛ x) ≡ x
* D strings too, so that x ~ "" ≡ x, and arrays likewise
* arithmetic operators' properties and identities as they apply to 
complex numbers

Another dream: strength reductions, so that sequences / patterns of 
operators (back to identities again, sort of) could be mapped to 
named helper functions or operators. For example, with strings:

s1 ~ s2 ~ s3 ~ … → StringArrayConcat( [ s1, s2, s3, … ] )
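
For illustration, a sketch of what such a named helper might look 
like (StringArrayConcat is the name from the wish above, not an 
existing library function); the point of the rewrite would be one 
correctly sized allocation instead of a chain of intermediate 
concatenations:

string StringArrayConcat(const(string)[] parts) pure
{
    size_t total;
    foreach (p; parts)
        total += p.length;

    auto buf = new char[total];       // single allocation of the final size
    size_t pos;
    foreach (p; parts)
    {
        buf[pos .. pos + p.length] = p[];
        pos += p.length;
    }
    return cast(string) buf;          // buf is not referenced anywhere else
}

unittest
{
    assert(StringArrayConcat(["ab", "", "cd"]) == "abcd");
}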
Aug 25 2016
parent Cecil Ward <d cecilward.com> writes:
On Thursday, 25 August 2016 at 18:17:21 UTC, Cecil Ward wrote:
 On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
 * special values of objects such zero, and one, so that that (x 
 ⊛ zero) ≡ x, and that (zero ⊛ x) ≡ x
(Should of course read (x ⊛ zero) ≡ zero, and that (one ⊛ x) ≡ x if you take the operator as being like multiplication.)
Aug 25 2016