digitalmars.D - Some nice new DMD slicing optimizations

Walter Bright <newshound2 digitalmars.com> writes:
https://github.com/dlang/dmd/pull/6176

I'm happy to report that DMD has (finally!) gotten some significant new 
optimizations! Specifically, 'slicing' a two register wide aggregate into two 
register-sized variables, enabling much better enregistering.

Given the code:

void foo(int[] a, int[] b, int[] c) {
     foreach (i; 0 .. a.length)
         a[i] = b[i] + c[i];
}

the inner loop formerly compiled to:

LA:             mov     EAX,018h[ESP]
                 mov     EDX,010h[ESP]
                 mov     ECX,[EBX*4][EAX]
                 add     ECX,[EBX*4][EDX]
                 mov     ESI,020h[ESP]
                 mov     [EBX*4][ESI],ECX
                 inc     EBX
                 cmp     EBX,01Ch[ESP]
                 jb      LA
and now:

L1A:            mov     ECX,[EBX*4][EDI]
                 add     ECX,[EBX*4][ESI]
                 mov     0[EBX*4][EBP],ECX
                 inc     EBX
                 cmp     EBX,EDX
                 jb      L1A
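
To make the transformation concrete, here is a rough source-level sketch of what slicing amounts to. It is not the code from the pull request: `fooSliced` and the local names `ap`, `bp`, `cp` and `n` are made up for illustration. Each `int[]` is a (length, pointer) pair, and the sketch splits every pair into separate scalars by hand so that each one can live in its own register. In the new listing, EBX appears to play the role of `i`, EDX of `n`, and EDI, ESI and EBP of the three pointers.

```d
// Original form: each int[] parameter is a two-register-wide
// (length, pointer) aggregate.
void foo(int[] a, int[] b, int[] c)
{
    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

// Hypothetical hand-sliced form (illustrative only, not from the PR):
// every aggregate is split into register-sized scalars up front.
void fooSliced(int[] a, int[] b, int[] c)
{
    int* ap = a.ptr;                // pointer half of a
    int* bp = b.ptr;                // pointer half of b
    int* cp = c.ptr;                // pointer half of c
    immutable size_t n = a.length;  // length half of a

    // Raw pointer indexing is never bounds checked, so this loop
    // corresponds to the original compiled with bounds checks off.
    for (size_t i = 0; i < n; ++i)
        ap[i] = bp[i] + cp[i];
}

unittest
{
    int[] b = [1, 2, 3];
    int[] c = [10, 20, 30];
    auto a1 = new int[3];
    auto a2 = new int[3];

    foo(a1, b, c);
    fooSliced(a2, b, c);

    assert(a1 == [11, 22, 33]);
    assert(a1 == a2);
}
```

With the optimization in place, the first form should no longer need this kind of manual rewriting to keep the pointers and the length in registers.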

I've been wanting to do this for years, and finally got around to it. (I also 
thought of a simpler way to implement it, which helped a lot.)

Further work will be in widening what this applies to.
Oct 06 2016
Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Friday, 7 October 2016 at 06:07:47 UTC, Walter Bright wrote:
 https://github.com/dlang/dmd/pull/6176

 I'm happy to report that DMD has (finally!) gotten some 
 significant new optimizations! Specifically, 'slicing' a two 
 register wide aggregate into two register-sized variables, 
 enabling much better enregistering.

 [...]
Awesome!
Oct 06 2016
rikki cattermole <rikki cattermole.co.nz> writes:
On 07/10/2016 7:07 PM, Walter Bright wrote:
 [...]
If there is bounds checking, shouldn't there be a check to guarantee that b and c are >= a.length? Otherwise, awesome!
Oct 06 2016
Ilya Yaroshenko <ilyayaroshenko gmail.com> writes:
On Friday, 7 October 2016 at 06:30:32 UTC, rikki cattermole wrote:
 On 07/10/2016 7:07 PM, Walter Bright wrote:
 [...]
 If there is bounds checking, shouldn't there be a check to guarantee that b and c are >= a.length? Otherwise, awesome!
The function is not @safe, so there are no bounds checks in release mode.
Oct 07 2016
Walter Bright <newshound2 digitalmars.com> writes:
On 10/6/2016 11:30 PM, rikki cattermole wrote:
 If there is bounds checking, shouldn't there be a check to guarantee that b and c are
 >= a.length?
I set -noboundscheck
Oct 07 2016
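
For readers who want the guarantee asked about above regardless of compiler flags, one option is a single explicit length check before the loop. This is only a sketch and was not proposed in the thread: `fooChecked` is a hypothetical name, and `std.exception.enforce` is used because it is an ordinary library call that stays in the program even when array bounds checks are disabled.

```d
import std.exception : enforce;

// Hypothetical variant of foo (not from the thread): validate the
// lengths once up front instead of relying on per-element bounds checks.
void fooChecked(int[] a, const(int)[] b, const(int)[] c)
{
    enforce(b.length >= a.length && c.length >= a.length,
            "b and c must be at least as long as a");

    foreach (i; 0 .. a.length)
        a[i] = b[i] + c[i];
}

unittest
{
    import std.exception : assertThrown;

    auto a = new int[3];
    fooChecked(a, [1, 2, 3], [10, 20, 30]);
    assert(a == [11, 22, 33]);

    // b is shorter than a, so enforce throws before any element is read.
    assertThrown(fooChecked(a, [1, 2], [10, 20, 30]));
}
```

The check costs one comparison per call rather than one per element, so the loop body itself stays the same as in the listings at the top of the thread.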