www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.ldc - How to track down a bad llvm optimization pass

reply Joakim <dlang joakim.fea.st> writes:
Since the update to ddmdfe 2.070, a single assert trips up on 
Android/ARM when running the druntime/phobos tests:

https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716

I've copy-pasted the relevant lines of the test into another 
file, testconv.d:

version(unittest) import std.conv, std.array, std.range;
unittest
{
     auto r = toChars!(16)(16u);
     assert(r.length == 2);
     assert(r[1..2].array == "0");
}

The test runs fine with -O1, but with -O2 or -O3, ie the levels 
when inlining are enabled, the second assert fails.  If I compile 
with -O2/3 -disable-inlining _or_ comment out the first assert, 
it passes.  Here's the IR generated for the unittest block with 
the first assert commented out, and with it included.

./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=without.ll

define void  _D8testconv14__unittestL2_1FZv() comdat {
   br label %forcond.i.i

forcond.i.i:                                      ; preds = 
%forcond.i.i, %0
   %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i 
], [ 1, %0 ] ; [#uses = 2, type = i32]
   %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; 
[#uses = 1, type = i32]
   %1 = lshr i32 %value.0.i.i, 4                   ; [#uses = 2]
   %2 = icmp eq i32 %1, 0                          ; [#uses = 1]
   %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; 
[#uses = 1]
   br i1 %2, label %bounds.ok.i, label %forcond.i.i

bounds.ok.i:                                      ; preds = 
%forcond.i.i
   %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i 
] ; [#uses = 1, type = i32]
   %3 = tail call i8* 
 _D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* 
null, i32 2, i32 1) ; [#uses = 2]
   %4 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %3, 1 ; 
[#uses = 1]
   %5 = shl i32 %indvars.iv.i.i.lcssa, 2           ; [#uses = 1]
   %6 = and i32 %5, 1020                           ; [#uses = 1]
   %7 = add nsw i32 %6, -12                        ; [#uses = 1]
   %8 = lshr i32 16, %7                            ; [#uses = 1]
   %9 = or i32 %8, 48                              ; [#uses = 1]
   %10 = trunc i32 %9 to i8                        ; [#uses = 1]
   store i8 %10, i8* %3, align 1
   %11 = tail call i32  _adEq2({ i32, i8* } %4, { i32, i8* } { i32 
1, i8* getelementptr inbounds ([2 x i8], [2 x i8]*  .str, i32 0, 
i32 0) }, %object.TypeInfo* nonnull  _D11TypeInfo_Aa6__initZ) #1 
; [#uses = 1]
   %12 = icmp eq i32 %11, 0                        ; [#uses = 1]
   br i1 %12, label %assertFailed, label %assertPassed

assertPassed:                                     ; preds = 
%bounds.ok.i
   ret void

assertFailed:                                     ; preds = 
%bounds.ok.i
   tail call void  _d_assert({ i32, i8* } { i32 10, i8* 
getelementptr inbounds ([11 x i8], [11 x i8]*  .str.1, i32 0, i32 
0) }, i32 4) #2
   unreachable
}

./bin/ldc2 -unittest -O2 --output-ll -c testconv.d -of=with.ll

define void  _D8testconv14__unittestL2_1FZv() comdat {
   br label %forcond.i.i

forcond.i.i:                                      ; preds = 
%forcond.i.i, %0
   %indvars.iv.i.i = phi i32 [ %indvars.iv.next.i.i, %forcond.i.i 
], [ 1, %0 ] ; [#uses = 2, type = i32]
   %value.0.i.i = phi i32 [ %1, %forcond.i.i ], [ 16, %0 ] ; 
[#uses = 1, type = i32]
   %1 = lshr i32 %value.0.i.i, 4                   ; [#uses = 2]
   %2 = icmp eq i32 %1, 0                          ; [#uses = 1]
   %indvars.iv.next.i.i = add nuw nsw i32 %indvars.iv.i.i, 1 ; 
[#uses = 1]
   br i1 %2, label 
%_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaN
NiNfkZ6Result.exit, label %forcond.i.i

_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaN
NiNfkZ6Result.exit: ; preds = %forcond.i.i
   %indvars.iv.i.i.lcssa = phi i32 [ %indvars.iv.i.i, %forcond.i.i 
] ; [#uses = 1, type = i32]
   %3 = and i32 %indvars.iv.i.i.lcssa, 255         ; [#uses = 1]
   %4 = icmp eq i32 %3, 2                          ; [#uses = 1]
   br i1 %4, label %bounds.ok.i, label %assertFailed

bounds.ok.i:                                      ; preds = 
%_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
   %5 = tail call i8* 
 _D4core6memory2GC6mallocFNaNbkkxC8TypeInfoZPv(%object.TypeInfo* 
null, i32 2, i32 1) ; [#uses = 2]
   %6 = insertvalue { i32, i8* } { i32 1, i8* undef }, i8* %5, 1 ; 
[#uses = 1]
   store i8 -1, i8* %5, align 1
   %7 = tail call i32  _adEq2({ i32, i8* } %6, { i32, i8* } { i32 
1, i8* getelementptr inbounds ([2 x i8], [2 x i8]*  .str.1, i32 
0, i32 0) }, %object.TypeInfo* nonnull  _D11TypeInfo_Aa6__initZ) 
#3 ; [#uses = 1]
   %8 = icmp eq i32 %7, 0                          ; [#uses = 1]
   br i1 %8, label %assertFailed2, label %assertPassed1

assertFailed:                                     ; preds = 
%_D3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZS3std4conv47__T7toCharsVii16TaVE3std5ascii10LetterCasei1TkZ7toCharsFNaNbNiNfkZ6Result.exit
   tail call void  _d_assert({ i32, i8* } { i32 10, i8* 
getelementptr inbounds ([11 x i8], [11 x i8]*  .str, i32 0, i32 
0) }, i32 3) #2
   unreachable

assertPassed1:                                    ; preds = 
%bounds.ok.i
   ret void

assertFailed2:                                    ; preds = 
%bounds.ok.i
   tail call void  _d_assert({ i32, i8* } { i32 10, i8* 
getelementptr inbounds ([11 x i8], [11 x i8]*  .str, i32 0, i32 
0) }, i32 4) #2
   unreachable
}

Clearly the problem is that those seven instructions before the 
call to _adEq2 in the bounds.ok.i section of the first IR get 
turned into this nonsense instruction in the second IR:

store i8 -1, i8* %5, align 1

Why including the first assert combines with some optimization 
pass and inlining to produce this junk instruction instead, I 
don't know.  I don't think this is something that needs to be 
fixed on the ldc end, but who knows.  Anyone have any tips on 
tracking this down?
Jun 16 2016
next sibling parent reply David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
 Why including the first assert combines with some optimization pass 
 and inlining to produce this junk instruction instead, I don't know.  
 I don't think this is something that needs to be fixed on the ldc end, 
 but who knows.  Anyone have any tips on tracking this down?
Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight. bugpoint, the LLVM tool for reducing test cases, also has some functionality to reduce the set of passes needed to trigger a miscompilation. It has worked for me in the past, but I always found configuring the linking/executing step to be a bit finicky. As a general comment, it is of course possible that we are hitting a genuine LLVM bug here, but I'd also be on the lookout for cases where we might be generating invalid IR (in the sense that we trigger some documented undefined behaviour, violate some assumptions, etc.). — David
Jun 17 2016
next sibling parent reply Kagamin <spam here.lot> writes:
On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
 As a general comment, it is of course possible that we are 
 hitting a genuine LLVM bug here, but I'd also be on the lookout 
 for cases where we might be generating invalid IR (in the sense 
 that we trigger some documented undefined behaviour, violate 
 some assumptions, etc.).
Can't it be ruled out by the `opt` tool? Though I thought it's strange how it works on ir-to-ir level.
Jun 17 2016
parent reply David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
On 17 Jun 2016, at 13:39, Kagamin via digitalmars-d-ldc wrote:
 Can't it be ruled out by the `opt` tool? Though I thought it's strange 
 how it works on ir-to-ir level.
I'm not sure I understand. What would `opt` rule out exactly? Yes, it allows you to run a custom subset of passes – `ldc2 -O2 -debug-pass=Arguments …` is very useful for that, by the way – but you still need to manually remove subsets of passes and re-build the executable until the issue disappears. (As an aside: Now that LDC accepts bitcode files on the command line, this doesn't require manually messing with object file emission/linking anymore.) By the way, the fact that it both consumes and produces IR is actually not strange at all; that's simply how the main LLVM optimiser is architected (with the exception of a further, mostly target-specific, optimisation stage during code generation). — David
Jun 17 2016
parent Kagamin <spam here.lot> writes:
If `opt` converts valid ir to invalid ir, then it's not ldc's 
failure.
Jun 17 2016
prev sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
 On 17 Jun 2016, at 0:23, Joakim via digitalmars-d-ldc wrote:
 Why including the first assert combines with some optimization 
 pass and inlining to produce this junk instruction instead, I 
 don't know.  I don't think this is something that needs to be 
 fixed on the ldc end, but who knows.  Anyone have any tips on 
 tracking this down?
Without having looked at the details of what's going on here in particular, my first step would be to use the "-print-after-all" option and look where the part that you think is clearly wrong first appears (e.g. just by textual search). Having a look at what the input to that pass was (i.e. the previous pass result) can often yield extra insight.
Hmm, I had tried -print-after-all with --debug-pass=Executions before and those flags dumped so much info that I didn't go through it much. Using -print-after-all alone is better: I see that the pass that muffs it up is "Global Value Numbering." If I disable that pass in llvm's PassManagerBuilder, the problem goes away. Looks like a regression in that llvm pass, I'll try building earlier versions of llvm to see if they had the same problem.
 As a general comment, it is of course possible that we are 
 hitting a genuine LLVM bug here, but I'd also be on the lookout 
 for cases where we might be generating invalid IR (in the sense 
 that we trigger some documented undefined behaviour, violate 
 some assumptions, etc.).
Given it works fine for other combinations of optimizations, I doubt it's a problem on our end, but I don't know enough about llvm's assumptions to say for certain. On Friday, 17 June 2016 at 15:51:40 UTC, Dan Olson wrote:
 Joakim <dlang joakim.fea.st> writes:

 Since the update to ddmdfe 2.070, a single assert trips up on 
 Android/ARM when running the druntime/phobos tests:

 https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716
Hi Joakim, Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end. Isn't Android essentially the same, but softfp? Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version? I've only been using 3.5 and 3.6.2 so far.
I get the same problem with that armhf triple and llvm 3.8.0, looks like it may be an llvm regression.
 I just tried both triples with ldc master of a week ago and 
 llvm-3.6.2. IR for each is identical except gnueabihf has stuff 
 for shared lib setup (dso_ctor, etc) which is not in android 
 version.
I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.
Jun 18 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 On Friday, 17 June 2016 at 08:48:01 UTC, David Nadlinger wrote:
 I just tried both triples with ldc master of a week ago and 
 llvm-3.6.2. IR for each is identical except gnueabihf has 
 stuff for shared lib setup (dso_ctor, etc) which is not in 
 android version.
I'll try building and linking against llvm 3.6 and 3.7 and see if it makes a difference.
I just built and linked against llvm 3.7.1 and the problem goes away, so it appears that it is a regression in the GVN optimization pass with llvm 3.8.0. I'll check if it goes away with llvm master, and file an llvm bug if it doesn't.
Jun 18 2016
parent reply Dan Olson <gorox comcast.net> writes:
Joakim <dlang joakim.fea.st> writes:

 On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 I just built and linked against llvm 3.7.1 and the problem goes away,
 so it appears that it is a regression in the GVN optimization pass
 with llvm 3.8.0.  I'll check if it goes away with llvm master, and
 file an llvm bug if it doesn't.
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Jun 18 2016
parent reply Dan Olson <gorox comcast.net> writes:
Dan Olson <gorox comcast.net> writes:

 Joakim <dlang joakim.fea.st> writes:

 On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 I just built and linked against llvm 3.7.1 and the problem goes away,
 so it appears that it is a regression in the GVN optimization pass
 with llvm 3.8.0.  I'll check if it goes away with llvm master, and
 file an llvm bug if it doesn't.
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Yup, fails same way on a Raspberry Pi with 3.8.0.
Jun 18 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Sunday, 19 June 2016 at 06:45:42 UTC, Dan Olson wrote:
 Dan Olson <gorox comcast.net> writes:

 Joakim <dlang joakim.fea.st> writes:

 On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 I just built and linked against llvm 3.7.1 and the problem 
 goes away,
 so it appears that it is a regression in the GVN optimization 
 pass
 with llvm 3.8.0.  I'll check if it goes away with llvm 
 master, and
 file an llvm bug if it doesn't.
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Yup, fails same way on a Raspberry Pi with 3.8.0.
And it's still hitting on llvm 3.9 master svn-271934 from a couple weeks ago, looks like we'll have to submit a bug report. Anyone know if we're breaking some llvm assumptions here?
Jun 18 2016
next sibling parent David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
On 19 Jun 2016, at 7:58, Joakim via digitalmars-d-ldc wrote:
 Anyone know if we're breaking some llvm assumptions here?
I'm not aware of any – if that is the case (which is quite likely), then it's of course a bug in our code gen. What I'd be inclined to do is to have a manual look at the IR immediately before GVN and see whether there is anything fishy to spot there already. I might also use a debugger on GVN to see exactly why it thinks it can perform a certain replacement – the LLVM pass internals are usually fairly well documented, although GVN is of course algorithmically a bit cumbersome to trace. — David
Jun 20 2016
prev sibling next sibling parent Dan Olson <gorox comcast.net> writes:
Joakim <dlang joakim.fea.st> writes:

 On Sunday, 19 June 2016 at 06:45:42 UTC, Dan Olson wrote:
 Dan Olson <gorox comcast.net> writes:

 Joakim <dlang joakim.fea.st> writes:

 On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 I just built and linked against llvm 3.7.1 and the problem goes
 away,
 so it appears that it is a regression in the GVN optimization pass
 with llvm 3.8.0.  I'll check if it goes away with llvm master, and
 file an llvm bug if it doesn't.
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Yup, fails same way on a Raspberry Pi with 3.8.0.
And it's still hitting on llvm 3.9 master svn-271934 from a couple weeks ago, looks like we'll have to submit a bug report. Anyone know if we're breaking some llvm assumptions here?
Most times when I found an llvm "bug", it turned out clang was generating slightly different IR than LDC. An example is using "byval" only for aggregates > 64-bytes. Only way to know these rules is looking at clang source. Currently, LDC abi-arm.cpp isn't following all rules that clang uses. My hunch is there is another. -- Dan
Jun 20 2016
prev sibling parent reply Rainer Schuetze <r.sagitario gmx.de> writes:
On 19.06.2016 08:58, Joakim wrote:
 On Sunday, 19 June 2016 at 06:45:42 UTC, Dan Olson wrote:
 Dan Olson <gorox comcast.net> writes:

 Joakim <dlang joakim.fea.st> writes:

 On Saturday, 18 June 2016 at 15:31:03 UTC, Joakim wrote:
 I just built and linked against llvm 3.7.1 and the problem goes away,
 so it appears that it is a regression in the GVN optimization pass
 with llvm 3.8.0.  I'll check if it goes away with llvm master, and
 file an llvm bug if it doesn't.
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Yup, fails same way on a Raspberry Pi with 3.8.0.
And it's still hitting on llvm 3.9 master svn-271934 from a couple weeks ago, looks like we'll have to submit a bug report. Anyone know if we're breaking some llvm assumptions here?
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
Jun 20 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Monday, 20 June 2016 at 20:15:52 UTC, Rainer Schuetze wrote:
 On 19.06.2016 08:58, Joakim wrote:
 On Sunday, 19 June 2016 at 06:45:42 UTC, Dan Olson wrote:
 Dan Olson <gorox comcast.net> writes:

 Joakim <dlang joakim.fea.st> writes:

 [...]
And I am trying opposite: builting ldc master against llvm 3.8.0. See if I can duplicate the problem.
Yup, fails same way on a Raspberry Pi with 3.8.0.
And it's still hitting on llvm 3.9 master svn-271934 from a couple weeks ago, looks like we'll have to submit a bug report. Anyone know if we're breaking some llvm assumptions here?
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
OK, will do. I think the fact that the exact same ldc frontend linked with llvm 3.7.1 doesn't have this problem, only when linked against llvm 3.8.0, indicates this most likely isn't a problem on our end.
Jun 20 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Tuesday, 21 June 2016 at 05:51:56 UTC, Joakim wrote:
 On Monday, 20 June 2016 at 20:15:52 UTC, Rainer Schuetze wrote:
 On 19.06.2016 08:58, Joakim wrote:
 [...]
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
OK, will do. I think the fact that the exact same ldc frontend linked with llvm 3.7.1 doesn't have this problem, only when linked against llvm 3.8.0, indicates this most likely isn't a problem on our end.
Filed: https://llvm.org/bugs/show_bug.cgi?id=28224
Jun 21 2016
parent reply Dan Olson <gorox concast.net> writes:
On Tuesday, 21 June 2016 at 07:00:35 UTC, Joakim wrote:
 On Tuesday, 21 June 2016 at 05:51:56 UTC, Joakim wrote:
 On Monday, 20 June 2016 at 20:15:52 UTC, Rainer Schuetze wrote:
 On 19.06.2016 08:58, Joakim wrote:
 [...]
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
OK, will do. I think the fact that the exact same ldc frontend linked with llvm 3.7.1 doesn't have this problem, only when linked against llvm 3.8.0, indicates this most likely isn't a problem on our end.
Filed: https://llvm.org/bugs/show_bug.cgi?id=28224
Hmmm. I looked at failing code in std.conv. I think it does a negative shift, which is undefined in C. It is the opSlice() method.
Jun 21 2016
next sibling parent reply Dan Olson <gorox concast.net> writes:
On Tuesday, 21 June 2016 at 19:41:59 UTC, Dan Olson wrote:
 Hmmm.  I looked at failing code in std.conv.  I think it does a 
 negative shift, which is undefined in C.  It is the opSlice() 
 method.
Or rather with unsigned types, a really big shift, also undefined.
Jun 21 2016
parent reply Dan Olson <gorox comcast.net> writes:
Dan Olson <gorox concast.net> writes:

 On Tuesday, 21 June 2016 at 19:41:59 UTC, Dan Olson wrote:
 Hmmm.  I looked at failing code in std.conv.  I think it does a
 negative shift, which is undefined in C.  It is the opSlice()
 method.
Or rather with unsigned types, a really big shift, also undefined.
std.conv.toChars() just has a bug and LLVM 3.8 is behaving ok with an undefined operation. The unittest for toChars isn't very exhaustive for opSlice and gets lucky on Intel targets. LDC with LLVM 3.5 also gets lucky on the limited unittest but toChars() clearly gives wrong results with better test case. Simple fix, following line should not have `- 1`: https://github.com/ldc-developers/phobos/blob/ldc/std/conv.d#L5700 I fixed locally and std.conv unittest is happy now. I will run full test suite on arm to make doubly sure, but I think this will end up as a phobos bug and PR. -- Dan
Jun 21 2016
parent reply Dan Olson <gorox comcast.net> writes:
Dan Olson <gorox comcast.net> writes:

 Dan Olson <gorox concast.net> writes:

 On Tuesday, 21 June 2016 at 19:41:59 UTC, Dan Olson wrote:
 Hmmm.  I looked at failing code in std.conv.  I think it does a
 negative shift, which is undefined in C.  It is the opSlice()
 method.
Or rather with unsigned types, a really big shift, also undefined.
std.conv.toChars() just has a bug and LLVM 3.8 is behaving ok with an undefined operation. The unittest for toChars isn't very exhaustive for opSlice and gets lucky on Intel targets. LDC with LLVM 3.5 also gets lucky on the limited unittest but toChars() clearly gives wrong results with better test case. Simple fix, following line should not have `- 1`: https://github.com/ldc-developers/phobos/blob/ldc/std/conv.d#L5700 I fixed locally and std.conv unittest is happy now. I will run full test suite on arm to make doubly sure, but I think this will end up as a phobos bug and PR.
Testsuite passes with LLVM 3.8.0 now. Filed phobos bug: https://issues.dlang.org/show_bug.cgi?id=16192 Should probably make an LDC Issue to track and pick phobos fix.
Jun 22 2016
parent kink <noone nowhere.com> writes:
On Wednesday, 22 June 2016 at 07:23:54 UTC, Dan Olson wrote:
 Testsuite passes with LLVM 3.8.0 now.
Thx Dan, great stuff.
Jun 22 2016
prev sibling parent reply Joakim <dlang joakim.fea.st> writes:
On Tuesday, 21 June 2016 at 19:41:59 UTC, Dan Olson wrote:
 On Tuesday, 21 June 2016 at 07:00:35 UTC, Joakim wrote:
 On Tuesday, 21 June 2016 at 05:51:56 UTC, Joakim wrote:
 On Monday, 20 June 2016 at 20:15:52 UTC, Rainer Schuetze 
 wrote:
 On 19.06.2016 08:58, Joakim wrote:
 [...]
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
OK, will do. I think the fact that the exact same ldc frontend linked with llvm 3.7.1 doesn't have this problem, only when linked against llvm 3.8.0, indicates this most likely isn't a problem on our end.
Filed: https://llvm.org/bugs/show_bug.cgi?id=28224
Hmmm. I looked at failing code in std.conv. I think it does a negative shift, which is undefined in C. It is the opSlice() method.
I just tried the three-line reduced test case on linux/x64 with the official ldc 1.0.0 release linked against 3.8.0 and it asserts there too. However, if I compile all of the std.conv unittests and run them all, they all pass. It appears that you've found a clear off-by-one bug. I didn't investigate our end because the tests were all passing otherwise- I now see that dmd doesn't assert even against the reduced test case- but I should have examined that code more closely. It would have helped if llvm and/or the ddmd frontend had flagged that Phobos was shifting by a negative number, rather than passing it through silently for either correct or junk codegen. I agree that the real fix for this bug should go in Phobos, but perhaps we can also improve the spots where this negative shift was most silently discarded so far, whether ddmd or llvm.
Jun 22 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Wednesday, 22 June 2016 at 08:54:22 UTC, Joakim wrote:
 On Tuesday, 21 June 2016 at 19:41:59 UTC, Dan Olson wrote:
 On Tuesday, 21 June 2016 at 07:00:35 UTC, Joakim wrote:
 On Tuesday, 21 June 2016 at 05:51:56 UTC, Joakim wrote:
 On Monday, 20 June 2016 at 20:15:52 UTC, Rainer Schuetze 
 wrote:
 On 19.06.2016 08:58, Joakim wrote:
 [...]
If it fails in a specific optimization pass, it has already passed a number of verification passes. When I posted some bug reports, the LLVM people (who have been very helpful) considered it an LLVM bug if an invalid IR passes the verifier. So even if some assumptions are broken, I would suggest to report to the LLVM bug tracker.
OK, will do. I think the fact that the exact same ldc frontend linked with llvm 3.7.1 doesn't have this problem, only when linked against llvm 3.8.0, indicates this most likely isn't a problem on our end.
Filed: https://llvm.org/bugs/show_bug.cgi?id=28224
Hmmm. I looked at failing code in std.conv. I think it does a negative shift, which is undefined in C. It is the opSlice() method.
I just tried the three-line reduced test case on linux/x64 with the official ldc 1.0.0 release linked against 3.8.0 and it asserts there too. However, if I compile all of the std.conv unittests and run them all, they all pass. It appears that you've found a clear off-by-one bug. I didn't investigate our end because the tests were all passing otherwise- I now see that dmd doesn't assert even against the reduced test case- but I should have examined that code more closely. It would have helped if llvm and/or the ddmd frontend had flagged that Phobos was shifting by a negative number, rather than passing it through silently for either correct or junk codegen. I agree that the real fix for this bug should go in Phobos, but perhaps we can also improve the spots where this negative shift was most silently discarded so far, whether ddmd or llvm.
I experimented a bit with this, using the latest dmd 2.071.0 release for linux/x64 and the following test that mimics what std.conv was doing wrong: int check_shift(int x) { return 16 >>> ((2 - x) * 4);} unittest { assert(check_shift(3) == 16); //assert(( 16 >>> (2-3) * 4) == 16); } If the second assert isn't commented out, dmd always evaluates that expression at compile-time and gives this error: shift.d(5): Error: shift by -4 is outside the range 0..31 If it's left commented out and the file is compiled with this command, ./2.071.0/linux/bin64/dmd -O -unittest -main shift.d the resulting binary asserts at runtime on line 4, ie the first assert. If I compile again with inlining, ./2.071.0/linux/bin64/dmd -O -inline -unittest -main shift.d the test passes! The codegen for dmd seems all over the place here.
Jun 22 2016
parent reply Dan Olson <gorox comcast.net> writes:
Joakim <dlang joakim.fea.st> writes:
 shift.d(5): Error: shift by -4 is outside the range 0..31
Agree, should be explored more to understand why that error message didn't appear in this case.
Jun 22 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Wednesday, 22 June 2016 at 16:06:13 UTC, Dan Olson wrote:
 Joakim <dlang joakim.fea.st> writes:
 shift.d(5): Error: shift by -4 is outside the range 0..31
Agree, should be explored more to understand why that error message didn't appear in this case.
I looked into this and it appears that dmd doesn't do the static analysis necessary to catch such bugs nor insert the runtime checks that could bail out with an exact error: https://issues.dlang.org/show_bug.cgi?id=7604 https://issues.dlang.org/show_bug.cgi?id=4835 Walter says he doesn't want runtime checks for performance reasons (comment 11 in the second linked bug), which is understandable. I'm not sure we can do anything but be more careful about such bugs in our D code. I've filed a dmd bug for the unrelated optimization issue I ran into above: https://issues.dlang.org/show_bug.cgi?id=16217
Jun 29 2016
next sibling parent reply Johan Engelen <j j.nl> writes:
On Wednesday, 29 June 2016 at 08:18:46 UTC, Joakim wrote:
 On Wednesday, 22 June 2016 at 16:06:13 UTC, Dan Olson wrote:
 Joakim <dlang joakim.fea.st> writes:
 shift.d(5): Error: shift by -4 is outside the range 0..31
Agree, should be explored more to understand why that error message didn't appear in this case.
I looked into this and it appears that dmd doesn't do the static analysis necessary to catch such bugs nor insert the runtime checks that could bail out with an exact error: https://issues.dlang.org/show_bug.cgi?id=7604 https://issues.dlang.org/show_bug.cgi?id=4835 Walter says he doesn't want runtime checks for performance reasons (comment 11 in the second linked bug), which is understandable. I'm not sure we can do anything but be more careful about such bugs in our D code.
This is LDC, not DMD, so for this it doesn't matter what Walter says :-) If you can implement a runtime check (default off) for undefined shifts, how is that bad? Clang's -fsanitize flags look great. http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html cheers, Johan
Jun 29 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Wednesday, 29 June 2016 at 15:09:33 UTC, Johan Engelen wrote:
 On Wednesday, 29 June 2016 at 08:18:46 UTC, Joakim wrote:
 On Wednesday, 22 June 2016 at 16:06:13 UTC, Dan Olson wrote:
 Joakim <dlang joakim.fea.st> writes:
 shift.d(5): Error: shift by -4 is outside the range 0..31
Agree, should be explored more to understand why that error message didn't appear in this case.
I looked into this and it appears that dmd doesn't do the static analysis necessary to catch such bugs nor insert the runtime checks that could bail out with an exact error: https://issues.dlang.org/show_bug.cgi?id=7604 https://issues.dlang.org/show_bug.cgi?id=4835 Walter says he doesn't want runtime checks for performance reasons (comment 11 in the second linked bug), which is understandable. I'm not sure we can do anything but be more careful about such bugs in our D code.
This is LDC, not DMD, so for this it doesn't matter what Walter says :-) If you can implement a runtime check (default off) for undefined shifts, how is that bad? Clang's -fsanitize flags look great. http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
I'll look into what clang's doing there. Speaking of shift errors, I took a look at why even ldc 1.0.0 fails for that undefined shift optimization bug I just reported for dmd and it does the same thing, simply removes the test block after inlining and optimization. Here's the relevant IR before and after inlining with -O1 -enable-inlining on linux/x64: *** IR Dump Before Function Integration/Inlining *** ; Function Attrs: uwtable define void _D5shift14__unittestL2_1FZv() #1 comdat { %1 = call i32 _D5shift11check_shiftFiZi(i32 3) #1 %2 = icmp eq i32 %1, 16 br i1 %2, label %assertPassed, label %assertFailed *** IR Dump After Function Integration/Inlining *** ; Function Attrs: uwtable define void _D5shift14__unittestL2_1FZv() #1 comdat { %1 = icmp eq i32 undef, 16 br i1 %1, label %assertPassed, label %assertFailed Eventually that gets optimized away to nothing, just like with dmd. Would this be considered an llvm bug or the expected result from passing an undefined right shift to llvm? It should at least warn about that undef after inlining, shouldn't it?
Jun 30 2016
parent reply Johan Engelen <j j.nl> writes:
On Thursday, 30 June 2016 at 11:50:15 UTC, Joakim wrote:
 Speaking of shift errors, I took a look at why even ldc 1.0.0 
 fails for that undefined shift optimization bug I just reported 
 for dmd and it does the same thing, simply removes the test 
 block after inlining and optimization.

 Here's the relevant IR before and after inlining with -O1 
 -enable-inlining on linux/x64:

 *** IR Dump Before Function Integration/Inlining ***
 ; Function Attrs: uwtable
 define void  _D5shift14__unittestL2_1FZv() #1 comdat {
   %1 = call i32  _D5shift11check_shiftFiZi(i32 3) #1
   %2 = icmp eq i32 %1, 16
   br i1 %2, label %assertPassed, label %assertFailed

 *** IR Dump After Function Integration/Inlining ***
 ; Function Attrs: uwtable
 define void  _D5shift14__unittestL2_1FZv() #1 comdat {
 %1 = icmp eq i32 undef, 16
 br i1 %1, label %assertPassed, label %assertFailed

 Eventually that gets optimized away to nothing, just like with 
 dmd.

 Would this be considered an llvm bug or the expected result 
 from passing an undefined right shift to llvm?  It should at 
 least warn about that undef after inlining, shouldn't it?
I am not a big LLVM expert, but I would not expect a warning at all. I think "undef" is used e.g. by frontends (e.g. LDC) to allow LLVM to optimize certain constructs. I'm guessing LLVM is thinking: "undef", great!, let's optimize the hell out of it!
Jun 30 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Thursday, 30 June 2016 at 12:33:21 UTC, Johan Engelen wrote:
 On Thursday, 30 June 2016 at 11:50:15 UTC, Joakim wrote:
 Speaking of shift errors, I took a look at why even ldc 1.0.0 
 fails for that undefined shift optimization bug I just 
 reported for dmd and it does the same thing, simply removes 
 the test block after inlining and optimization.

 Here's the relevant IR before and after inlining with -O1 
 -enable-inlining on linux/x64:

 *** IR Dump Before Function Integration/Inlining ***
 ; Function Attrs: uwtable
 define void  _D5shift14__unittestL2_1FZv() #1 comdat {
   %1 = call i32  _D5shift11check_shiftFiZi(i32 3) #1
   %2 = icmp eq i32 %1, 16
   br i1 %2, label %assertPassed, label %assertFailed

 *** IR Dump After Function Integration/Inlining ***
 ; Function Attrs: uwtable
 define void  _D5shift14__unittestL2_1FZv() #1 comdat {
 %1 = icmp eq i32 undef, 16
 br i1 %1, label %assertPassed, label %assertFailed

 Eventually that gets optimized away to nothing, just like with 
 dmd.

 Would this be considered an llvm bug or the expected result 
 from passing an undefined right shift to llvm?  It should at 
 least warn about that undef after inlining, shouldn't it?
I am not a big LLVM expert, but I would not expect a warning at all. I think "undef" is used e.g. by frontends (e.g. LDC) to allow LLVM to optimize certain constructs. I'm guessing LLVM is thinking: "undef", great!, let's optimize the hell out of it!
I assumed that undef was some kind of poison value, but it looks like it's normally used: http://www.playingwithpointers.com/problem-with-undef.html I guess the issue then is whether the inlining pass should just be returning undef from evaluating check_shift(3), and moving on happily from there. Since this is at compile-time, I don't think it should. Does anyone know what llvm's stance is on this? Are we supposed to be running sanitizers or something else to avoid these bugs?
Jun 30 2016
parent reply David Nadlinger via digitalmars-d-ldc <digitalmars-d-ldc puremagic.com> writes:
On 30 Jun 2016, at 16:40, Joakim via digitalmars-d-ldc wrote:
 I assumed that undef was some kind of poison value
undef is indeed "some kind of poison value", in that each use of it evaluates to a (potentially different) arbitrary bit string. By itself, using an undef isn't undefined behaviour, but of course for many operations it ultimately is, because there are bit string inputs for which these operations are undefined (e.g. loads, stores). LLVM knows a concept called "poison values" too, which are undefs with slightly stronger semantics produced by C-style signed integer arithmetic overflow and similar operations – in loose terms, any operation that depends on them in an externally visible way has undefined behaviour. I usually find the LLVM language reference (http://llvm.org/docs/LangRef.html) to be quite a clear resource for these sorts of questions.
 [should] the inlining pass […] just be returning undef […]? Since 
 this is at compile-time, I don't think it should. […]
 Are we supposed to be running sanitizers or something else to avoid 
 these bugs?
First off, as it currently stands, this is certainly not an issue in LLVM. The lshr instruction is documented as resulting in undefined behaviour when used with an out-of-range shift. Replacing the whole call with `undef` is thus a valid IR transformation. So far for LLVM working as designed. The question of course becomes whether, being a compiler writer's tool, it would be nice for it to emit a warning on such transformations. And here things suddenly become muddy. Yes, in this case, getting a warning would be useful. However, if the code was not actually reachable dynamically, a warning would be wrong. Of course, this can be solved by offering a way to declare basic blocks/functions to be considered reachable for that purpose, but that introduces extra complexity – I wouldn't be surprised if the fact that you'd need to design something along these lines was the main reason why LLVM does not try to report such conditions. Of course, language frontends can always emit dynamical checks to avoid executing llvm::Instructions with UB-inducing arguments, whether in the form of sanitisers, or by default as part of faithfully lowering their semantics. — David
Jun 30 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Thursday, 30 June 2016 at 18:10:04 UTC, David Nadlinger wrote:
 On 30 Jun 2016, at 16:40, Joakim via digitalmars-d-ldc wrote:
 I assumed that undef was some kind of poison value
undef is indeed "some kind of poison value", in that each use of it evaluates to a (potentially different) arbitrary bit string. By itself, using an undef isn't undefined behaviour, but of course for many operations it ultimately is, because there are bit string inputs for which these operations are undefined (e.g. loads, stores). LLVM knows a concept called "poison values" too, which are undefs with slightly stronger semantics produced by C-style signed integer arithmetic overflow and similar operations – in loose terms, any operation that depends on them in an externally visible way has undefined behaviour. I usually find the LLVM language reference (http://llvm.org/docs/LangRef.html) to be quite a clear resource for these sorts of questions.
OK, I read some IR reference on undef, but hadn't seen that llvm has a separate formal notion of a poison value, because it isn't part of the IR.
 [should] the inlining pass […] just be returning undef […]? 
 Since this is at compile-time, I don't think it should. […]
 Are we supposed to be running sanitizers or something else to 
 avoid these bugs?
First off, as it currently stands, this is certainly not an issue in LLVM. The lshr instruction is documented as resulting in undefined behaviour when used with an out-of-range shift. Replacing the whole call with `undef` is thus a valid IR transformation. So far for LLVM working as designed. The question of course becomes whether, being a compiler writer's tool, it would be nice for it to emit a warning on such transformations. And here things suddenly become muddy. Yes, in this case, getting a warning would be useful.
I tried this with clang, it does exactly the same as ldc. I've appended the output to the bug report: https://llvm.org/bugs/show_bug.cgi?id=28224#c7
 However, if the code was not actually reachable dynamically, a 
 warning would be wrong. Of course, this can be solved by 
 offering a way to declare basic blocks/functions to be 
 considered reachable for that purpose, but that introduces 
 extra complexity – I wouldn't be surprised if the fact that 
 you'd need to design something along these lines was the main 
 reason why LLVM does not try to report such conditions.
I don't know that it matters if the code is "reachable dynamically," ie you mean it isn't actually used at runtime? Most of the time, you'd want to know that llvm is just removing what it considers to be bad code, with no notice whatsoever.
 Of course, language frontends can always emit dynamical checks 
 to avoid executing llvm::Instructions with UB-inducing 
 arguments, whether in the form of sanitisers, or by default as 
 part of faithfully lowering their semantics.
I tried the clang sanitizer, as noted in the bug report: it works, though at runtime. As for the frontend doing it at compile-time, either we build a bunch of static analysis in to catch such bugs or we need some way to work with llvm IR and its optimization passes so it can tell us when such things are happening. I was surprised that llvm is just silently doing this, with no warning.
Jul 02 2016
parent reply Johan Engelen <j j.nl> writes:
On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote:
 
 I was surprised that llvm is just silently doing this, with no 
 warning.
This blog article gives some insight: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html
Jul 02 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Saturday, 2 July 2016 at 07:42:31 UTC, Johan Engelen wrote:
 On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote:
 
 I was surprised that llvm is just silently doing this, with no 
 warning.
This blog article gives some insight: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html
Thanks for the link, it's very relevant. I've mentioned these issues and closed the llvm bug as invalid, as that GVN pass was only surfacing this undefined behavior. Leaving aside the optimizer, the unoptimized phobos debug tests should have caught this, shouldn't they? I'll look into why those passed next.
Jul 04 2016
parent reply Joakim <dlang joakim.fea.st> writes:
On Monday, 4 July 2016 at 09:21:19 UTC, Joakim wrote:
 On Saturday, 2 July 2016 at 07:42:31 UTC, Johan Engelen wrote:
 On Saturday, 2 July 2016 at 07:29:47 UTC, Joakim wrote:
 
 I was surprised that llvm is just silently doing this, with 
 no warning.
This blog article gives some insight: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html
Thanks for the link, it's very relevant. I've mentioned these issues and closed the llvm bug as invalid, as that GVN pass was only surfacing this undefined behavior. Leaving aside the optimizer, the unoptimized phobos debug tests should have caught this, shouldn't they? I'll look into why those passed next.
Alright, tracked this down, it was a test that had a bad failure mode. I'll write it up in the General forum. Johan, you mentioned this other shift bug that you ran into: https://github.com/dlang/phobos/pull/4509 Was that just one that you had hit recently or did you go checking all 20+ uses of logical right shift in Phobos to look for bugs?
Jul 06 2016
parent Johan Engelen <j j.nl> writes:
On Wednesday, 6 July 2016 at 18:00:40 UTC, Joakim wrote:
 Johan, you mentioned this other shift bug that you ran into:

 https://github.com/dlang/phobos/pull/4509

 Was that just one that you had hit recently or did you go 
 checking all 20+ uses of logical right shift in Phobos to look 
 for bugs?
Yeah that's the one I wrote about before. I discovered it after a long time debugging :( I did check the other >>>= in Phobos and they looked OK, but I did not spend much time on checking them.
Jul 06 2016
prev sibling parent Johan Engelen <j j.nl> writes:
Just now I discovered that the optimization bug in my inlining 
branch is exactly this: a shift outside the defined range 
happens, and LLVM happily optimizes the program using the 
undefined behavior and terrible things happen.

A runtime check would have been a great help here, saving me 
hours of bug hunting.

The bug is in std.range.primitives, BitsSet(T)::popFront():
```
     void popFront()
     {
         assert(_value, "Cannot call popFront on empty range.");

         _value >>>= 1;
         uint n = countTrailingZeros(_value); // returns 
sizeof(_value) * 8 when _value==0
         _value >>>= n; // BOOM!
```
Jun 30 2016
prev sibling parent Dan Olson <gorox comcast.net> writes:
Joakim <dlang joakim.fea.st> writes:

 Since the update to ddmdfe 2.070, a single assert trips up on
 Android/ARM when running the druntime/phobos tests:

 https://github.com/dlang/phobos/blob/v2.070.2/std/conv.d#L5716
Hi Joakim, Target arm-unknown-linux-gnueabihf is doing ok even with v2.071.0 front end. Isn't Android essentially the same, but softfp? Should be able to change triple and compare IR and see what is different between android triple (what is it by the way, arm-linux-android?) and gnueabihf triple. Or maybe it is llvm version? I've only been using 3.5 and 3.6.2 so far. I just tried both triples with ldc master of a week ago and llvm-3.6.2. IR for each is identical except gnueabihf has stuff for shared lib setup (dso_ctor, etc) which is not in android version. -- Dan
Jun 17 2016