digitalmars.D.bugs - [Issue 6215] New: ICE(el.c) DMD segfaults when built on system with XCode 4.2

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=6215

           Summary: ICE(el.c) DMD segfaults when built on system with
                    XCode 4.2
           Product: D
           Version: D1 & D2
          Platform: Other
        OS/Version: Mac OS X
            Status: NEW
          Severity: blocker
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: robert octarineparrot.com



14:28:18 BST ---
As of XCode 4.2 (maybe 4.1; I skipped it), Apple has made gcc a symlink to
llvm-gcc. When built using llvm-gcc, dmd segfaults in el.c:211:

void test(){}
void main(){}

$ dmd test.d
Segmentation fault

It fails in el.c:211.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Jun 26 2011
d-bugmail puremagic.com writes:


Jacob Carlborg <doob me.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doob me.com



GCC is a symlink for LLVM-GCC with XCode 4.1 as well.

Jun 26 2011
d-bugmail puremagic.com writes:




19:33:46 BST ---
The following patch is a workaround; something seems to be going wrong with the
elem recycling system:
----
diff --git a/src/backend/el.c b/src/backend/el.c
index f5fa66d..9cc34fc 100644
--- a/src/backend/el.c
+++ b/src/backend/el.c
@@ -195,6 +195,7 @@ elem *el_calloc()
     static elem ezero;

     elcount++;
+#if 0
     if (nextfree)
     {   e = nextfree;
         nextfree = e->E1;
@@ -209,6 +210,9 @@ elem *el_calloc()
     eprm_cnt++;
 #endif
     *e = ezero;                         /* clear it             */
+#else
+    e = (elem *)mem_fmalloc(sizeof(elem));
+#endif

 #ifdef DEBUG
     e->id = IDelem;
----
If you print e and *e, *e is NULL, hence the segfault when assigned to.

Jun 26 2011
d-bugmail puremagic.com writes:


Sean Kelly <sean invisibleduck.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sean invisibleduck.org



---
The first problem is in el_calloc():

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 13 at address: 0x00000000
0x0009fc2f in el_calloc () at el.c:189
189        *e = ezero;                         /* clear it             */
(gdb) p e
$1 = (elem *) 0xa4e7bc
Current language:  auto; currently c++
(gdb) 

I can't explain why this isn't working, but it's easily fixed by replacing the
assignment with:

memset(e, 0, sizeof(elem));

That gets us to the next error:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: 13 at address: 0x00000000
0x000aa08d in evalu8 (e=0xa50988) at evalu8.c:625
625            esave = *e;
(gdb) p e
$1 = (elem *) 0xa50988
Current language:  auto; currently c++
(gdb) 

Which is similarly fixed by replacing the assignment with:

memcpy(&esave, e, sizeof(elem));

These two changes are enough for an LLVM-built DMD to build druntime.

Given the two errors above, the problem seems to be with the default assignment
operator LLVM generates for the elem struct.  It's a very weird problem though.
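
For illustration, the two workarounds above boil down to the pattern below. This is only a minimal sketch, not the dmd code: `Elem` is a hypothetical stand-in for the backend's elem struct with made-up fields, and `clear_elem` / `save_elem` are invented helper names. For a plain struct like this, the byte-wise calls leave the same field values as the assignments they replace.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the backend's elem struct (fields are made up).
struct Elem {
    int op;
    Elem *E1;
    Elem *E2;
    long long value;
};

// Byte-wise clear, standing in for `*e = ezero;`.
void clear_elem(Elem *e) {
    assert(e != nullptr);
    std::memset(e, 0, sizeof(Elem));
}

// Byte-wise copy, standing in for `esave = *e;`.
void save_elem(Elem *dst, const Elem *src) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, sizeof(Elem));
}
```

Both helpers only change how the bytes are moved, so they would hide a faulting struct copy without explaining it.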

Aug 11 2011
d-bugmail puremagic.com writes:


Brad Roberts <braddr puremagic.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |braddr puremagic.com



---
Would you take that info and try the same sort of code in a standalone test
case?  If struct assignment is indeed the problem, that's a pretty embarrassing
LLVM bug, imho, and clearly should be reported either to LLVM directly or to Apple
as the provider of that version of the compiler.

Aug 11 2011
d-bugmail puremagic.com writes:


klickverbot <code klickverbot.at> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code klickverbot.at



---

> Given the two errors above, the problem seems to be with the default assignment
> operator LLVM generates for the elem struct.  It's a very weird problem though.

Note that at least the first error happens with both LLVM-GCC, which uses the GCC
frontend, and Clang.

Aug 12 2011
d-bugmail puremagic.com writes:




---
The difference in the LLVM IR generated by Clang for the ezero change is only:

-  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %17, i8* getelementptr inbounds
(%struct.elem* @_ZZ9el_callocvE5ezero, i32 0, i32 0), i32 80, i32 16, i1 false)
+  call void @llvm.memset.p0i8.i32(i8* %17, i8 0, i32 80, i32 1, i1 false)

Note that the second-to-last parameter to memcpy is the alignment (16 bytes), but
GDB says that »(int)e % 16« is 8.
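
The GDB check above can also be written as a small helper. A minimal sketch, assuming a hypothetical `Elem16` type that merely mimics the 80-byte size and 16-byte alignment seen in the IR (it is not the real elem struct); `is_aligned` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical type with the size (80) and alignment (16) seen in the IR.
struct alignas(16) Elem16 {
    unsigned char bytes[80];
};

// True if p is a multiple of `alignment`, which must be a power of two.
// Same test as `(int)e % 16` in GDB, done on the full pointer width.
bool is_aligned(const void *p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

An address that sits 8 bytes past a 16-byte boundary passes `is_aligned(p, 8)` but fails `is_aligned(p, 16)`, which matches the situation reported for e.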

Aug 12 2011
d-bugmail puremagic.com writes:




---
And indeed, __alignof__(*e) gives 16. Patching the allocator to 16-byte align
everything is easy:

--- a/src/tk/mem.c
+++ b/src/tk/mem.c
@@ -758,7 +758,7 @@ void *mem_fmalloc(unsigned numbytes)
     if (sizeof(size_t) == 2)
         numbytes = (numbytes + 1) & ~1;         /* word align   */
     else
-        numbytes = (numbytes + 3) & ~3;         /* dword align  */
+        numbytes = (numbytes + 15) & ~15;

     /* This ugly flow-of-control is so that the most common case
        drops straight through.
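
To see why rounding the size is enough, here is a minimal sketch of the idea. It assumes the allocator carves requests sequentially out of a larger block, as the size rounding suggests; `toy_alloc`, `pool` and `round_up` are hypothetical names and this is not the code in mem.c. Rounding every request up to a multiple of 16 only keeps later pointers aligned if the block itself starts on a 16-byte boundary.

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next multiple of align (align must be a power of two).
// With align == 16 this is the (numbytes + 15) & ~15 from the patch.
std::size_t round_up(std::size_t n, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

// Toy bump allocator (hypothetical): hands out consecutive chunks
// from a pool whose start is 16-byte aligned.
alignas(16) static unsigned char pool[1024];
static std::size_t pool_used = 0;

void *toy_alloc(std::size_t numbytes) {
    numbytes = round_up(numbytes, 16);
    if (numbytes > sizeof(pool) - pool_used)
        return nullptr;                 // pool exhausted
    void *p = pool + pool_used;
    pool_used += numbytes;              // stays a multiple of 16
    return p;
}
```

Because `pool_used` only ever grows by multiples of 16, every pointer returned is 16-byte aligned regardless of the sizes requested.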

Aug 12 2011
d-bugmail puremagic.com writes:




---
A preliminary patch that 16-byte aligns allocations only when building with an
LLVM backend is at: https://github.com/D-Programming-Language/dmd/pull/301.
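
One way such a conditional could look is sketched below. This is an assumption, not the content of the pull request: it keys on the `__llvm__` and `__clang__` predefined macros, and `kAllocAlign` and `aligned_size` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: __llvm__ (set by llvm-gcc and Clang) or __clang__ identifies
// a host compiler that uses LLVM as its backend.
#if defined(__llvm__) || defined(__clang__)
static const std::size_t kAllocAlign = 16;  // match __alignof__(elem)
#else
static const std::size_t kAllocAlign = 4;   // original dword alignment
#endif

// Round a request up to the alignment selected above.
std::size_t aligned_size(std::size_t numbytes) {
    std::size_t result = (numbytes + kAllocAlign - 1) & ~(kAllocAlign - 1);
    assert(result >= numbytes);             // catch wrap-around on huge sizes
    return result;
}
```

Other host compilers keep the old 4-byte rounding, so their allocation sizes are unchanged.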

Aug 12 2011
d-bugmail puremagic.com writes:




Is this specific to Mac OS X or is it like this with LLVM in general?

Aug 12 2011
d-bugmail puremagic.com writes:




---

> Is this specific to Mac OS X or is it like this with LLVM in general?

Happens on my Linux x86_64 box too.

Aug 12 2011
d-bugmail puremagic.com writes:




PDT ---
Awesome.  I figured it was an alignment mistake for the copy, but ran out of
time to investigate.  What an embarrassing bug for LLVM.

Aug 12 2011
d-bugmail puremagic.com writes:




---

> Awesome.  I figured it was an alignment mistake for the copy, but ran out of
> time to investigate.  What an embarrassing bug for LLVM.

Just for clarity, let me note that this is definitely _not_ a bug in LLVM, it
just happens with two compilers using LLVM as their backend.

Aug 12 2011
d-bugmail puremagic.com writes:


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |FIXED



18:28:32 PDT ---
https://github.com/D-Programming-Language/dmd/commit/77838ef515fa69bf7f379d3ff6cff0b19b49577e

https://github.com/D-Programming-Language/dmd/commit/06d04bf519f7103ab5a39dff0f863979dbdc8bd2

Aug 12 2011