www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - #dbugfix Issue 16189

reply Kirr <kkryukov gmail.com> writes:
https://issues.dlang.org/show_bug.cgi?id=16189

This is a P1 critical DMD bug, silently generating wrong code. 
Reported one and a half year ago, but exists in DMD compilers 
going back to DMD 2.050 (possibly more, but this is as far back 
as I tried).

The repro is essentially C, it involves nothing but array, struct 
and loop. No templates, CTFE, no phobos, no any fancy features.

I can fully understand that may be this bug is hard to fix, that 
there's not enough manpower, that there are other more pressing 
issues, that no one can be expected to work on something they are 
not interested in, etc. And may I'm the only one affected? (it's 
a mystery to me). But this bug is a phychological obstacle to my 
use of D (or rather DMD, because LDC seems OK). May be it's just 
my expectation that is too high.

I'm not ready to give up arrays, structs or loops. I 
experimentally worked around this once I found it, by re-wording 
the loop. But I've no idea how to make sure this bug won't hit me 
somewhere else. I guess I can live without optimizer (or without 
DMD). In any case, I think that miscompiling a simplest C-like 
loop does not send the right message to a newcomer.

I'll appreciate any help with this bug.
Feb 07
next sibling parent reply Michael <michael toohuman.io> writes:
On Wednesday, 7 February 2018 at 08:51:01 UTC, Kirr wrote:
 https://issues.dlang.org/show_bug.cgi?id=16189

 This is a P1 critical DMD bug, silently generating wrong code. 
 Reported one and a half year ago, but exists in DMD compilers 
 going back to DMD 2.050 (possibly more, but this is as far back 
 as I tried).

 The repro is essentially C, it involves nothing but array, 
 struct and loop. No templates, CTFE, no phobos, no any fancy 
 features.

 I can fully understand that may be this bug is hard to fix, 
 that there's not enough manpower, that there are other more 
 pressing issues, that no one can be expected to work on 
 something they are not interested in, etc. And may I'm the only 
 one affected? (it's a mystery to me). But this bug is a 
 phychological obstacle to my use of D (or rather DMD, because 
 LDC seems OK). May be it's just my expectation that is too high.

 I'm not ready to give up arrays, structs or loops. I 
 experimentally worked around this once I found it, by 
 re-wording the loop. But I've no idea how to make sure this bug 
 won't hit me somewhere else. I guess I can live without 
 optimizer (or without DMD). In any case, I think that 
 miscompiling a simplest C-like loop does not send the right 
 message to a newcomer.

 I'll appreciate any help with this bug.
Yeah that's... really bad. I suppose the saving grace is that it's only a serious issue with the use of the optimiser, which I don't tend to use on DMD, but this should really be prioritised. I guess the fact that it's not a regression might make fixing it harder, but it doesn't really matter. Does it work with slightly varied examples like where a = -1, and is incremented etc.?
Feb 07
parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Wednesday, 7 February 2018 at 11:37:19 UTC, Michael wrote:
 Does it work with slightly varied examples like where a = -1, 
 and is incremented etc.?
I played with it here https://run.dlang.io/is/15sr6c and every variation I tried worked (correctly). So the example seems to be pretty minimal. Maybe someone understanding assembly is able to see what is going on after pressing that nice [ASM] button?
Feb 07
next sibling parent Seb <seb wilzba.ch> writes:
On Wednesday, 7 February 2018 at 13:50:06 UTC, Bastiaan Veelo 
wrote:
 On Wednesday, 7 February 2018 at 11:37:19 UTC, Michael wrote:
 Does it work with slightly varied examples like where a = -1, 
 and is incremented etc.?
I played with it here https://run.dlang.io/is/15sr6c and every variation I tried worked (correctly). So the example seems to be pretty minimal. Maybe someone understanding assembly is able to see what is going on after pressing that nice [ASM] button?
You don't need to understand assembler to grasp what's going on. A printf does a good job too: printf("%d\n", a); // 1908874352 It's even more fun, reduce the size of the static array to 8: ubyte[8][1] data; // works However, with 7 it fails again: ubyte[7][1] data; If you look at the assembly you will see that the compiler didn't even include the assert calls for even static arrays: https://run.dlang.io/is/Qkt1nA (-O) https://run.dlang.io/is/7UfmXJ (-O [8]) https://run.dlang.io/is/QrtNeI (normal) Here's an annotated excerpt from the normal build: --- dec qword ptr -010h[RBP] xor EDX,EDX ; a-- mov -8[RBP],DL test DL,DL ; if (b) je L5C jmp short L18 ; goto loop L5C: cmp qword ptr -010h[RBP],0FFFFFFFFFFFFFFFFh je L78 ; assert(a == -1) mov DL,0Ah lea RSI,_TMP0 PC32[RIP] lea RDI,_TMP0 PC32[RIP] call __assert PLT32 L78: xor EAX,EAX --- The same looks slightly different with -O. Here with printf and a different value to be compared with a because it's easier to read: --- main: push RBP mov RBP,RSP sub RSP,010h lea RAX,-010h[RBP] xor ECX,ECX mov [RAX],RCX mov 8[RAX],CL lea RSI,-010h[RBP] lea RDI,-010h[RBP] movsd movsb mov RSI,01C71C71C71C71C70h ; 2 function argument (value of a) lea RDI,FLAT:.rodata[00h][RIP] ; "%d" (1st function argument) xor EAX,EAX ; set eax to 0 call printf PLT32 ; printf("%d", a) mov EDX,0Ch lea RSI,_TMP0 PC32[RIP] ; load function arguments (in reverse order) lea RDI,_TMP0 PC32[RIP] call __assert PLT32 ; values load, let's call assert add [RAX],AL .text.main ends end ---- So the optimizer seems to be confused and wrongly precomputes a. Note that if you change something, a will be correctly statically set in the printf too: --- mov RSI,0FFFFFFFFh ; look ma - a is now -1 lea RDI,FLAT:.rodata[00h][RIP] xor EAX,EAX ; set eax to 0 call printf PLT32 ---
Feb 07
prev sibling parent reply ketmar <ketmar ketmar.no-ip.org> writes:
Bastiaan Veelo wrote:

 On Wednesday, 7 February 2018 at 11:37:19 UTC, Michael wrote:
 Does it work with slightly varied examples like where a = -1, and is 
 incremented etc.?
I played with it here https://run.dlang.io/is/15sr6c and every variation I tried worked (correctly). So the example seems to be pretty minimal. Maybe someone understanding assembly is able to see what is going on after pressing that nice [ASM] button?
there is a mix of loop strength reduction and data flow analysis. as inner loop contains array access and bounds checking, optimizer decided to turn `a` into actual index (i.e. multiply it by 16), and use subtraction in loop body. so far so good, optimizer did a good job. but then we have `assert(a == -1);` after the loop. and that is where things goes off the road: optimizer knows that it has `a` in register, and it knows that `a` was pre-multiplied, so it decided to divide `a` to 16 to get the real value. but... it does this with `shr` instruction instead of `sar`! most of the time it works as expected, but in our case... oops, we lost a sign. the fix should be fairly easy, but sorry, i can't get any sense from dmd backend. i see what it wrong due to my expirience with similar things, but that's all. here Walter (or some other backand guru) should step in.
Feb 07
parent Kirr <kkryukov gmail.com> writes:
On Wednesday, 7 February 2018 at 18:46:50 UTC, ketmar wrote:
 ...
Great brainstorming, guys! Hopefuly the gained understanding will lead to eventual fix.
Feb 07
prev sibling parent Mike Parker <aldacron gmail.com> writes:
On Wednesday, 7 February 2018 at 08:51:01 UTC, Kirr wrote:
 https://issues.dlang.org/show_bug.cgi?id=16189

 I'll appreciate any help with this bug.
Noted!
Feb 07