
D.gnu - [Bug 99] New: -fsection-anchors broken on ARM

http://bugzilla.gdcproject.org/show_bug.cgi?id=99


           Summary: -fsection-anchors broken on ARM
    Classification: Unclassified
           Product: GDC
           Version: development
          Platform: ARM
        OS/Version: Other
            Status: NEW
          Severity: major
          Priority: Normal
         Component: libgdruntime
        AssignedTo: ibuclaw gdcproject.org
        ReportedBy: slavo5150 yahoo.com


Created attachment 58
  --> http://bugzilla.gdcproject.org/attachment.cgi?id=58
issue120.patch

This was migrated from
https://bitbucket.org/goshawk/gdc/issue/120/fsection-anchors-broken-on-arm

Johannes Pfau created an issue 2010-12-04
******************************************
When running a program on ARM, the GC immediately enters an infinite loop.
Debugging shows that the entered loop is in gcx.d line 2179:

for (; p < ptop; p += size)
{
     (cast(List *)p).next = *b;
     *b = cast(List *)p;
}

p and ptop are correct, but size is 0. size is set in line 2174:

size_t size = binsize[bin];

binsize is just an array of uints:

immutable uint binsize[B_MAX] = [ 16,32,64,128,256,512,1024,2048,4096 ];

bin was 3 in my tests, a correct index value. Still the array entry lookup
somehow fails. This only happens with optimization turned on though: If
druntime is compiled with -O0 the gc works fine.
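
The hang follows from the shape of the loop: with a step of 0, p never reaches ptop. Below is a minimal C sketch of that behaviour. The function count_free_list_steps and its iteration limit are hypothetical (they are not part of gcx.d), and a 4096-byte page is assumed.

```c
#include <stddef.h>

/* Hypothetical helper (not from gcx.d): walks a page of `page_bytes` bytes in
   steps of `chunk` bytes, like the allocPage loop, but gives up after
   `max_steps` iterations so that a zero step cannot hang the caller.
   Returns the number of steps taken, or -1 if the limit was reached. */
long count_free_list_steps(size_t page_bytes, size_t chunk, long max_steps)
{
    size_t p = 0;     /* offset of the current chunk inside the page */
    long steps = 0;

    for (; p < page_bytes; p += chunk) {
        if (steps == max_steps)
            return -1;    /* p is not advancing; the real loop spins here */
        steps++;
    }
    return steps;
}
```

With a 4096-byte page, a chunk size of 128 finishes after 32 steps, while a chunk size of 0 only stops because of the artificial limit.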

Testing with optimization enabled:

I added code to make a copy of binsize and looked at that copy with gdb:

$1 = {0 <repeats 12 times>}

So I looked at the binsize symbol:

readelf -s -w ./hello_d | grep binsize
  5665: 00050c64    48 OBJECT  GLOBAL DEFAULT   14 _D2gc3gcx7binsizeyG12k

Let's have a look at section 14:

readelf -S hello_d
There are 41 section headers, starting at offset 0x175464:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [14] .rodata           PROGBITS        00049db8 041db8 0089d0 00   A  0   0  8

Dump it:

readelf -x 14 hello_d
...
  0x00050c58 00000000 00000000 00000000 10000000 ................
  0x00050c68 20000000 40000000 80000000 00010000  ... ...........
  0x00050c78 00020000 00040000 00080000 00100000 ................
  0x00050c88 00000000 00000000 00000000 00000000 ................
  0x00050c98 00000000 00000000 00000000 00000000 ................

seems to be correct though.
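
To read the dump: the symbol starts at 0x50c64, which is the fourth word of the first row, and each 8-digit group shows four bytes in memory order. On a little-endian target the group 10000000 therefore means 16. The sketch below shows that decoding in C. The function names are hypothetical, and input validation is left out on purpose.

```c
#include <stdint.h>

/* Convert one hex digit to its value; anything else yields 0
   (validation is deliberately omitted in this sketch). */
static unsigned hex_digit(char c)
{
    if (c >= '0' && c <= '9') return (unsigned)(c - '0');
    if (c >= 'a' && c <= 'f') return (unsigned)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return (unsigned)(c - 'A' + 10);
    return 0;
}

/* Decode one 8-digit group from the hex dump (four bytes in memory order)
   as a little-endian 32-bit value. */
uint32_t le32_from_dump_word(const char *word)
{
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        unsigned byte = hex_digit(word[2 * i]) * 16 + hex_digit(word[2 * i + 1]);
        value |= (uint32_t)byte << (8 * i);  /* byte i is the i-th least significant */
    }
    return value;
}
```

Applied to the groups above, le32_from_dump_word("10000000") gives 16 and le32_from_dump_word("00100000") gives 4096, matching the first and last initialized entries of binsize.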

asm dump of function allocPage:

Dump of assembler code for function _D2gc3gcx3Gcx9allocPageMFhZi:
0x00042c58 <_D2gc3gcx3Gcx9allocPageMFhZi+0>:    push    {r4, r5, r6, r7, r8, lr}

0x00042c60 <_D2gc3gcx3Gcx9allocPageMFhZi+8>:    mov    r5, r0

0x00042c68 <_D2gc3gcx3Gcx9allocPageMFhZi+16>:    mov    r7, r1
0x00042c6c <_D2gc3gcx3Gcx9allocPageMFhZi+20>:    beq    0x42d08 <_D2gc3gcx3Gcx9allocPageMFhZi+176>

0x00042c74 <_D2gc3gcx3Gcx9allocPageMFhZi+28>:    b    0x42c84 <_D2gc3gcx3Gcx9allocPageMFhZi+44>

0x00042c7c <_D2gc3gcx3Gcx9allocPageMFhZi+36>:    cmp    r3, r4
0x00042c80 <_D2gc3gcx3Gcx9allocPageMFhZi+40>:    bls    0x42d08 <_D2gc3gcx3Gcx9allocPageMFhZi+176>

0x00042c90 <_D2gc3gcx3Gcx9allocPageMFhZi+56>:    add    r4, r4, r1
0x00042c94 <_D2gc3gcx3Gcx9allocPageMFhZi+60>:    mov    r0, r6
0x00042c98 <_D2gc3gcx3Gcx9allocPageMFhZi+64>:    bl    0x42bfc <_D2gc3gcx4Pool10allocPagesMFkZk>

0x00042ca0 <_D2gc3gcx3Gcx9allocPageMFhZi+72>:    beq    0x42c78 <_D2gc3gcx3Gcx9allocPageMFhZi+32>

0x42d10 <_D2gc3gcx3Gcx9allocPageMFhZi+184>
0x00042cac <_D2gc3gcx3Gcx9allocPageMFhZi+84>:    strb    r7, [r3, r0]
0x00042cb0 <_D2gc3gcx3Gcx9allocPageMFhZi+88>:    ldr    r3, [r6]

0x00042cc0 <_D2gc3gcx3Gcx9allocPageMFhZi+104>:    add    r12, r0, r2

0x1000

0x00042cd0 <_D2gc3gcx3Gcx9allocPageMFhZi+120>:    mov    r1, r12
0x00042cd4 <_D2gc3gcx3Gcx9allocPageMFhZi+124>:    b    0x42ce0 <_D2gc3gcx3Gcx9allocPageMFhZi+136>
0x00042cd8 <_D2gc3gcx3Gcx9allocPageMFhZi+128>:    mov    r4, r3
0x00042cdc <_D2gc3gcx3Gcx9allocPageMFhZi+132>:    add    r12, r12, r2
0x00042ce0 <_D2gc3gcx3Gcx9allocPageMFhZi+136>:    add    r1, r1, r2
0x00042ce4 <_D2gc3gcx3Gcx9allocPageMFhZi+140>:    rsb    r3, r2, r1
0x00042ce8 <_D2gc3gcx3Gcx9allocPageMFhZi+144>:    cmp    r6, r3
0x00042cec <_D2gc3gcx3Gcx9allocPageMFhZi+148>:    str    r4, [r0]
0x00042cf0 <_D2gc3gcx3Gcx9allocPageMFhZi+152>:    mov    r3, r0
0x00042cf4 <_D2gc3gcx3Gcx9allocPageMFhZi+156>:    str    r0, [r7]
0x00042cf8 <_D2gc3gcx3Gcx9allocPageMFhZi+160>:    mov    r0, r12
0x00042cfc <_D2gc3gcx3Gcx9allocPageMFhZi+164>:    bhi    0x42cd8 <_D2gc3gcx3Gcx9allocPageMFhZi+128>

0x00042d04 <_D2gc3gcx3Gcx9allocPageMFhZi+172>:    pop    {r4, r5, r6, r7, r8, pc}

0x00042d0c <_D2gc3gcx3Gcx9allocPageMFhZi+180>:    pop    {r4, r5, r6, r7, r8, pc}
0x00042d10 <_D2gc3gcx3Gcx9allocPageMFhZi+184>:    ldrdeq    r0, [r5], -r4

Testing with -O0

asm dump, compiled with -O0

Dump of assembler code for function _D2gc3gcx3Gcx9allocPageMFhZi:
0x00046eb8 <_D2gc3gcx3Gcx9allocPageMFhZi+0>:    push    {r11, lr}

0x28
0x00046ec8 <_D2gc3gcx3Gcx9allocPageMFhZi+16>:    mov    r3, r1

0x29

0x28

0x00046f0c <_D2gc3gcx3Gcx9allocPageMFhZi+84>:    cmp    r2, r3

0x00046f28 <_D2gc3gcx3Gcx9allocPageMFhZi+112>:    bne    0x46f78 <_D2gc3gcx3Gcx9allocPageMFhZi+192>

0x28

0x54

0x00046f3c <_D2gc3gcx3Gcx9allocPageMFhZi+132>:    add    r3, r2, r3
0x00046f40 <_D2gc3gcx3Gcx9allocPageMFhZi+136>:    ldr    r3, [r3]

0x00046f50 <_D2gc3gcx3Gcx9allocPageMFhZi+152>:    bl    0x4898c <_D2gc3gcx4Pool10allocPagesMFkZk>
0x00046f54 <_D2gc3gcx3Gcx9allocPageMFhZi+156>:    mov    r3, r0

0x00046f64 <_D2gc3gcx3Gcx9allocPageMFhZi+172>:    bne    0x46f80 <_D2gc3gcx3Gcx9allocPageMFhZi+200>

0x00046f74 <_D2gc3gcx3Gcx9allocPageMFhZi+188>:    b    0x46f00 <_D2gc3gcx3Gcx9allocPageMFhZi+72>

0x00046f7c <_D2gc3gcx3Gcx9allocPageMFhZi+196>:    b    0x4704c <_D2gc3gcx3Gcx9allocPageMFhZi+404>
0x00046f80 <_D2gc3gcx3Gcx9allocPageMFhZi+200>:    nop            ; (mov r0, r0)

0x58

0x00046f90 <_D2gc3gcx3Gcx9allocPageMFhZi+216>:    add    r3, r2, r3

0x29
0x00046f98 <_D2gc3gcx3Gcx9allocPageMFhZi+224>:    strb    r2, [r3]

0x29

0x47058 <_D2gc3gcx3Gcx9allocPageMFhZi+416>
0x00046fa8 <_D2gc3gcx3Gcx9allocPageMFhZi+240>:    add    r3, r2, r3
0x00046fac <_D2gc3gcx3Gcx9allocPageMFhZi+244>:    ldr    r3, [r3]

0x28

0x29

0x00046fc4 <_D2gc3gcx3Gcx9allocPageMFhZi+268>:    add    r3, r2, r3

0x00046fd0 <_D2gc3gcx3Gcx9allocPageMFhZi+280>:    ldr    r2, [r3]

0x00046fdc <_D2gc3gcx3Gcx9allocPageMFhZi+292>:    add    r3, r2, r3

0x1000

0x00046ff8 <_D2gc3gcx3Gcx9allocPageMFhZi+320>:    cmp    r2, r3

0x00047014 <_D2gc3gcx3Gcx9allocPageMFhZi+348>:    bne    0x47048 <_D2gc3gcx3Gcx9allocPageMFhZi+400>

0x00047020 <_D2gc3gcx3Gcx9allocPageMFhZi+360>:    ldr    r2, [r2]
0x00047024 <_D2gc3gcx3Gcx9allocPageMFhZi+364>:    str    r2, [r3]

0x00047030 <_D2gc3gcx3Gcx9allocPageMFhZi+376>:    str    r2, [r3]

0x0004703c <_D2gc3gcx3Gcx9allocPageMFhZi+388>:    add    r3, r2, r3

0x00047044 <_D2gc3gcx3Gcx9allocPageMFhZi+396>:    b    0x46ff0 <_D2gc3gcx3Gcx9allocPageMFhZi+312>

0x0004704c <_D2gc3gcx3Gcx9allocPageMFhZi+404>:    mov    r0, r3

0x00047054 <_D2gc3gcx3Gcx9allocPageMFhZi+412>:    pop    {r11, pc}


I hope this information helps to fix this bug at some point; I don't
understand the asm well enough to figure out the problem by myself.

Johannes Pfau - 2010-12-04
******************************************
edited description


Iain Buclaw - 2010-12-05
******************************************
* changed status to open

Can you backtrace with argument name/values?

I think this is what I was referring to in the NG thread. What you should see
is a parameter with a corrupt/large value.

Also, shot in the dark, but what are your configure flags? Have you tried
building with flags that closely match your ARM board spec?

ie:

--with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb

Valid arch flags: armv[23456] | armv2a | armv3m | armv4t | armv5t | armv5te |
armv6j | armv6k | armv6z | armv6zk | armv6-m | armv7 | armv7-a | armv7-r |
armv7-m | iwmmxt | ep9312

Valid float flags: soft | hard | softfp

Valid fpu flags: fpa | fpe2 | fpe3 | maverick | vfp | vfp3 | vfpv3 | vfpv3-d16
| neon

Valid mode flags: arm | thumb

If in doubt, probably just use the same flags as what your system gcc was built
with.

Regards


Johannes Pfau - 2010-12-05
******************************************
Is "bt full" with gdb enough? Some values are optimized out, but as the problem
doesn't show with the unoptimized version:

Full backtrace, optimized version

0  _D2gc3gcx3Gcx9allocPageMFhZi (this=<value optimized out>, 
    bin=<value optimized out>)
    at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:2179
        pn = <value optimized out>
        pool = <value optimized out>
        b = 0x66094
        size = 0
        ptop = 0x40235000 ""
        p = 0x40234000 ""
        n = <value optimized out>
1  0x0004444c in _D2gc3gcx2GC12mallocNoSyncMFkkPkZPv (this=..., size=88, 
    bits=1, alloc_size=<value optimized out>)
    at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:459
        collected = false
        state = 2
        bin = 3 '\003'
        p = <value optimized out>
2  0x000451d4 in _D2gc3gcx2GC6mallocMFkkPkZPv (this=..., size=88, bits=1, 
    alloc_size=0x0) at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:418
        __sync7 =  0x64c0c
3  0x00041d58 in gc_malloc (sz=269656, ba=0)
    at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:188
No locals.
4  0x0003c1f8 in _d_newclass (ci=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/lifetime.d:125
        p = <value optimized out>
5  0x00040c4c in thread_attachThis ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1813
        thisContext = 0xbee79628
        thisThread = <value optimized out>
6  0x00040da0 in thread_init ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1792
        sigusr2 = {sa_handler = 0x3fde4 <thread_resumeHandler>, 
          sa_sigaction = 0x3fde4 <thread_resumeHandler>, sa_mask = {__val = {
              2147483647, 4294967294, 4294967295 <repeats 30 times>}}, 
          sa_flags = 0, sa_restorer = 0}
        sigusr1 = {sa_handler = 0x40b94 <thread_suspendHandler>, 
          sa_sigaction = 0x40b94 <thread_suspendHandler>, sa_mask = {__val = {
              2147483647, 4294967294, 4294967295 <repeats 30 times>}}, 
          sa_flags = 268435456, sa_restorer = 0}
7  0x000423ac in gc_init () at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:115
        p = 0x66018
8  0x00045f50 in runAll ()
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:498
No locals.
9  0x00045cec in tryExec (dg=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:438
No locals.
10 0x00045e84 in main (argc=<value optimized out>, argv=0xbee79944)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:515
        args = {length = 1, ptr = 0x66008}
        result = 0
        trapExceptions = true

Interesting stuff, you are right: in frame 3, sz and ba are wrong, but the
corresponding values are correct in frame 2.

Partial backtrace, optimized version, with breakpoints

Breakpoint 1, gc_malloc (sz=88, ba=1)
    at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:188
188    ../../../gcc-4.4.5-build/libphobos/gc/gc.d: No such file or directory.
    in ../../../gcc-4.4.5-build/libphobos/gc/gc.d
Current language:  auto
The current source language is "auto; currently minimal".
(gdb) cont
Continuing.

Breakpoint 2, _D2gc3gcx2GC6mallocMFkkPkZPv (this=..., size=88, bits=1, 
    alloc_size=0x0) at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:406
406    ../../../gcc-4.4.5-build/libphobos/gc/gcx.d: No such file or directory.
    in ../../../gcc-4.4.5-build/libphobos/gc/gcx.d
(gdb) bt
0  _D2gc3gcx2GC6mallocMFkkPkZPv (this=..., size=88, bits=1, alloc_size=0x0)
    at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:406
1  0x00041d58 in gc_malloc (sz=269656, ba=1)
    at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:188
2  0x0003c1f8 in _d_newclass (ci=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/lifetime.d:125
3  0x00040c4c in thread_attachThis ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1813
4  0x00040da0 in thread_init ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1792
5  0x000423ac in gc_init () at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:115
6  0x00045f50 in runAll ()
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:498
7  0x00045cec in tryExec (dg=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:438
8  0x00045e84 in main (argc=<value optimized out>, argv=0xbe829944)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:515

The value is actually correct in the gc_malloc function and it's passed on
correctly, but it's corrupted in the backtrace. Strange. This doesn't happen for
the unoptimized version. Also, in the optimized version printf output is
corrupted; maybe this all somehow belongs together.

Full backtrace with -O0

0  _D2gc3gcx3Gcx9allocPageMFhZi (this=..., bin=3 '\003')
    at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:2179
        pn = 0
        pool = 0x6a0f0
        b = 0x6a094
        size = 128
        ptop = 0x40235000 ""
        p = 0x40234000 ""
        n = 0
1  0x00043414 in _D2gc3gcx2GC12mallocNoSyncMFkkPkZPv (this=..., size=88, 
    bits=1, alloc_size=0x0) at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:459
        collected = false
        state = 2
        bin = 3 '\003'
        p = 0x0
2  0x00043298 in _D2gc3gcx2GC6mallocMFkkPkZPv (this=..., size=88, bits=1, 
    alloc_size=0x0) at ../../../gcc-4.4.5-build/libphobos/gc/gcx.d:418
        __sync7 =  0x68c00
3  0x00042014 in gc_malloc (sz=88, ba=1)
    at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:188
No locals.
4  0x0003c1f8 in _d_newclass (ci=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/lifetime.d:125
        p = <value optimized out>
5  0x00040c4c in thread_attachThis ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1813
        thisContext = 0xbee3c618
        thisThread = <value optimized out>
6  0x00040da0 in thread_init ()
    at ../../../gcc-4.4.5-build/libphobos/core/thread.d:1792
        sigusr2 = {sa_handler = 0x3fde4 <thread_resumeHandler>, 
          sa_sigaction = 0x3fde4 <thread_resumeHandler>, sa_mask = {__val = {
              2147483647, 4294967294, 4294967295 <repeats 30 times>}}, 
          sa_flags = 0, sa_restorer = 0}
        sigusr1 = {sa_handler = 0x40b94 <thread_suspendHandler>, 
          sa_sigaction = 0x40b94 <thread_suspendHandler>, sa_mask = {__val = {
              2147483647, 4294967294, 4294967295 <repeats 30 times>}}, 
          sa_flags = 268435456, sa_restorer = 0}
7  0x00041c58 in gc_init () at ../../../gcc-4.4.5-build/libphobos/gc/gc.d:115
        ci =  0x68c50
        p = 0x6a018
8  0x0004991c in runAll ()
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:498
No locals.
9  0x000496b8 in tryExec (dg=...)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:438
No locals.
10 0x00049850 in main (argc=<value optimized out>, argv=0xbee3c944)
    at ../../../gcc-4.4.5-build/libphobos/rt/dmain2.d:515
        args = {length = 1, ptr = 0x6a008}
        result = 0
        trapExceptions = true

Configure flags, native GCC

Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-6'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.4 --enable-shared --enable-multiarch
--enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
--disable-sjlj-exceptions --enable-checking=release --build=arm-linux-gnueabi
--host=arm-linux-gnueabi --target=arm-linux-gnueabi

Configure flags, cross GDC

Configured with: ../gcc-4.4.5-build/configure --prefix=/usr
--target=arm-linux-gnueabi --host=i686-pc-linux-gnu --build=i686-pc-linux-gnu
--enable-languages=d
--enable-threads --disable-nls --enable-shared --enable-multiarch
--disable-multilib --with-as=/usr/bin/arm-linux-gnueabi-as
--with-ld=/usr/bin/arm-linux-gnueabi-ld
--with-sysroot=/usr/i686-pc-linux-gnu/arm-linux-gnueabi

The board uses a Feroceon 88FR131 rev 1, google says that's armv5te, thumb mode
supported, no hardware floating point, so I should use

--with-arch=armv5te --with-float=softfp --with-mode=thumb

? I will try that later today.

Iain Buclaw - 2010-12-05
******************************************
Hmm, maybe it's all insignificant, but I can't recall why I knew that was there
now. Possibly at the time (I was kindly given an ssh account to log in to for
testing) I may have seen that the call for: gc_malloc (sz=269656, ba=0) somehow
messed things up later on. I honestly can't remember, I'm afraid. :)

It's certainly worth a try using that, as it may have an effect on the
ASM generated by GCC.

Though I can see that at the very least, when I get my
C̶h̶r̶i̶s̶t̶m̶a̶s̶
̶p̶r̶e̶s̶e̶n̶t̶ Sheeva plug, will have to build phobos with -O0 and then
debug/slim-down a standalone version of gcx.d


Johannes Pfau - 2010-12-06
******************************************
Ok, I tried to build with those flags, but I hit this error.

/var/abs/local/cross-arm-linux-gnueabi/cross-arm-none-linux-gnueabi-gdc2-hg/src/gcc-build/./gcc/gdc
-B/var/abs/local/cross-arm-linux-gnueabi/cross-arm-none-linux-gnueabi-gdc2-hg/src/gcc-build/./gcc/
-B/usr/arm-linux-gnueabi/bin/ -B/usr/arm-linux-gnueabi/lib/ -isystem
/usr/arm-linux-gnueabi/include -isystem /usr/arm-linux-gnueabi/sys-include -o
std/math.o -Wall -g -frelease -O2 -fversion=GC_Use_Alloc_MMap
-fversion=GC_Use_Stack_GLibC -fversion=GC_Use_Data_Fixed -nostdinc -pipe
-fdeprecated -I ../../../gcc-4.4.5-build/libphobos -I ./arm-linux-gnueabi 
-femit-templates -c ../../../gcc-4.4.5-build/libphobos/std/math.d
{standard input}: Assembler messages:
{standard input}:383: Error: selected processor does not support `mrc
p10,7,r0,cr1,cr0,0'
{standard input}:470: Error: selected processor does not support `mcr
p10,7,r0,cr1,cr0,0'

Interesting, I didn't know gcc can detect this. Seems like software floating
point needs some special handling there. I just commented out the asm blocks
and the compiler built fine. However, the GC problem still remains. Well I'll
just compile without optimization until you get your sheevaplug and this can
hopefully be fixed.

Johannes Pfau - 2011-04-22
******************************************
Update:

With newest hg dmd and the patches from issue 193 and gcc 4.5.2 the infinite
loop is gone and instead a segmentation fault happens. The location where the
segfault occurs according to gdb is just as weird as the gc infinite loop bug.
Again this only occurs with -O >= 1, everything's fine with -O0.

Further investigation showed that the problem is -fsection-anchors. With
-fno-section-anchors even -O3 works fine. Interestingly -fsection-anchors is
enabled by default on ARM, but not on x86. Probably this is not even an ARM
specific bug. I'll recompile phobos with -fsection-anchors on x86 to see what
happens there.

(speculation:) I don't know much about compilers, machine code, etc. but
reading GCC's description of -fsection-anchors it seems quite possible that this
optimization has caused all the trouble. The gc problem was caused by a
variable being read as 0 although the static data and everything was correct.
If this data was read from a wrong address that could explain the wrong value
and it also explains the segfaults I'm seeing with a recent gdc.

Johannes Pfau - 2011-04-22
******************************************
x86 doesn't support -fsection-anchors, even if explicitly enabled. So it could
still be a problem on all architectures which support -fsection-anchors but
I've got no hardware to test that.

Iain, would it be possible to "blacklist" -fsection-anchors? Or to add
"-fno-section-anchors" to the default arguments for gdc so that it would just
work on ARM?

Johannes Pfau - 2011-04-22
******************************************
* changed title to -fsection-anchors broken on ARM


Andrew Wiley - 2011-04-23
******************************************
From my understanding of ARM assembly, I would guess that they want
-fsection-anchors because ARM cannot use full size pointers as immediate values
in instructions, which makes statically computed addresses hard to use. That
said, I'm not sure how much of a performance hit -fno-section-anchors would
cause.

Iain Buclaw - 2011-04-23
******************************************
> Further investigation showed that the problem is -fsection-anchors. With
> -fno-section-anchors even -O3 works fine. Interestingly -fsection-anchors is
> enabled by default on ARM, but not on x86. Probably this is not even an ARM
> specific bug. I'll recompile phobos with -fsection-anchors on x86 to see what
> happens there.

Right, turning off flag_section_anchors is the workaround.

The next step would be to bastardise the gcx module into a simple program to
showcase an example of what gets affected by -fsection-anchors (assuming it's
just the one problem that propagates the issue).

Johannes Pfau - 2011-09-27
******************************************
I still wasn't able to reduce gcx.d, but I stepped through the asm code and
found the following:

The exact error changes a lot depending on the compiler flags and gdc version.
In this case, findBin returned 0 instead of 3, which caused the memset in
mallocNoSync to fail:

memset(p + size, 0, binsize[bin] - size)

evaluated to binsize[0] - 88 = 16-88 so it passed a negative size to memset.

This is the correct address & values for binTable in the findBin function:

(gdb) info address _D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g
Symbol "_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g" is at 0x59110 in a file
compiled without debugging.
(gdb) x/88db 0x59110
0x59110 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g>:    0 0 0 0 0 0 0 0
0x59118 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+8>:  0 0 0 0 0 0 0 0
0x59120 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+16>: 0 1 1 1 1 1 1 1
0x59128 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+24>: 1 1 1 1 1 1 1 1
0x59130 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+32>: 1 2 2 2 2 2 2 2
0x59138 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+40>: 2 2 2 2 2 2 2 2
0x59140 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+48>: 2 2 2 2 2 2 2 2
0x59148 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+56>: 2 2 2 2 2 2 2 2
0x59150 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+64>: 2 3 3 3 3 3 3 3
0x59158 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+72>: 3 3 3 3 3 3 3 3
---Type <return> to continue, or q <return> to quit---
0x59160 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+80>: 3 3 3 3 3 3 3 3

And this is the findBin function:

(gdb) disas _D2gc3gcx3Gcx7findBinFkZh
Dump of assembler code for function _D2gc3gcx3Gcx7findBinFkZh:
//if size <= 2048 load the word at <_D2gc3gcx3Gcx7findBinFkZh+20> into r3
<_D2gc3gcx3Gcx7findBinFkZh+20>
//if not, move 8(B_PAGE) into r0(return value register)
//Load value at address r3+r0 into r0; r0 = binTable[size]
0x0001b910 <+12>: ldrbls r0, [r3, r0]
//return from function
0x0001b914 <+16>: bx lr
//not an instruction, this is the base address of binTable
0x0001b918 <+20>: muleq r5, r0, r0
End of assembler dump.

Let's have a look at the base address for binTable:

(gdb) x/xw 0x0001b918
0x1b918 <_D2gc3gcx3Gcx7findBinFkZh+20>: 0x00059090

The address is 128 bytes off!
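
The effect of the wrong base can be modelled in a few lines of C. This is a simplified sketch, not the runtime's code: the buffer stands for the memory starting at 0x59090, the table is placed 128 bytes in (where 0x59110 would be), everything before it is assumed to be zero, and the table contents are regenerated from a rule that matches the 88 bytes dumped above. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum { TABLE_OFFSET = 128, TABLE_LEN = 2049 };

/* Model of a slice of .rodata: zeros, then the bin table at TABLE_OFFSET. */
static uint8_t image[TABLE_OFFSET + TABLE_LEN];
static int image_ready = 0;

/* Smallest bin whose chunk size (16 << bin) can hold `size` bytes. */
static uint8_t bin_for_size(size_t size)
{
    uint8_t bin = 0;
    for (size_t limit = 16; limit < size; limit *= 2)
        bin++;
    return bin;
}

/* Fill in the table once; the bytes in front of it stay zero. */
static void build_image(void)
{
    for (size_t size = 0; size < TABLE_LEN; size++)
        image[TABLE_OFFSET + size] = bin_for_size(size);
    image_ready = 1;
}

/* Counterpart of `ldrb r0, [r3, r0]`: load one byte at base + size.
   The caller must keep base + size inside the image. */
uint8_t load_bin(size_t base, size_t size)
{
    if (!image_ready)
        build_image();
    return image[base + size];
}
```

In this model load_bin(128, 88) returns 3, the expected bin, while load_bin(0, 88) returns 0, which is the value findBin produced with the literal-pool address that is 128 bytes too low.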
And this is where the 0 returned by findBin comes from:

(gdb) x/88db 0x00059090
0x59090:        0 0 0 0 0 0 0 0
0x59098:        0 0 0 0 0 0 0 0
0x590a0 <_D2gc3gcx10notbinsizeyG12k>:    -16 -1 -1 -1 -32 -1 -1 -1
0x590a8 <_D2gc3gcx10notbinsizeyG12k+8>:  -64 -1 -1 -1 -128 -1 -1 -1
0x590b0 <_D2gc3gcx10notbinsizeyG12k+16>: 0 -1 -1 -1 0 -2 -1 -1
0x590b8 <_D2gc3gcx10notbinsizeyG12k+24>: 0 -4 -1 -1 0 -8 -1 -1
0x590c0 <_D2gc3gcx10notbinsizeyG12k+32>: 0 -16 -1 -1 0 0 0 0
0x590c8 <_D2gc3gcx10notbinsizeyG12k+40>: 0 0 0 0 0 0 0 0
0x590d0:        0 0 0 0 0 0 0 0
0x590d8:        0 0 0 0 0 0 0 0
0x590e0:        0 0 0 0 0 0 0 0

Larger view of the memory at 0x00059090

(gdb) x/216db 0x00059090
0x59090:        0 0 0 0 0 0 0 0
0x59098:        0 0 0 0 0 0 0 0
0x590a0 <_D2gc3gcx10notbinsizeyG12k>:    -16 -1 -1 -1 -32 -1 -1 -1
0x590a8 <_D2gc3gcx10notbinsizeyG12k+8>:  -64 -1 -1 -1 -128 -1 -1 -1
0x590b0 <_D2gc3gcx10notbinsizeyG12k+16>: 0 -1 -1 -1 0 -2 -1 -1
0x590b8 <_D2gc3gcx10notbinsizeyG12k+24>: 0 -4 -1 -1 0 -8 -1 -1
0x590c0 <_D2gc3gcx10notbinsizeyG12k+32>: 0 -16 -1 -1 0 0 0 0
0x590c8 <_D2gc3gcx10notbinsizeyG12k+40>: 0 0 0 0 0 0 0 0
0x590d0:        0 0 0 0 0 0 0 0
0x590d8:        0 0 0 0 0 0 0 0
0x590e0:        0 0 0 0 0 0 0 0
0x590e8:        0 0 0 0 0 0 0 0
0x590f0:        0 0 0 0 0 0 0 0
0x590f8:        0 0 0 0 0 0 0 0
0x59100:        0 0 0 0 0 0 0 0
---Type <return> to continue, or q <return> to quit---
0x59108:        0 0 0 0 0 0 0 0
0x59110 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g>:    0 0 0 0 0 0 0 0
0x59118 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+8>:  0 0 0 0 0 0 0 0
0x59120 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+16>: 0 1 1 1 1 1 1 1
0x59128 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+24>: 1 1 1 1 1 1 1 1
0x59130 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+32>: 1 2 2 2 2 2 2 2
0x59138 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+40>: 2 2 2 2 2 2 2 2
0x59140 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+48>: 2 2 2 2 2 2 2 2
0x59148 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+56>: 2 2 2 2 2 2 2 2
0x59150 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+64>: 2 3 3 3 3 3 3 3
0x59158 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+72>: 3 3 3 3 3 3 3 3
---Type <return> to continue, or q <return> to quit---
0x59160 <_D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g+80>: 3 3 3 3 3 3 3 3

Let's also have a look how relocation affects this whole thing:

objdump -x GDC/gdc/dev/gcc-4.6.1/objdir/armv7l-unknown-linux-gnueabi/libphobos/gc/gcx.o

SYMBOL TABLE:
000000e0 g O .rodata 00000801 _D2gc3gcx3Gcx7findBinFkZh8binTablexG2049g

RELOCATION RECORDS FOR [.text]:
00000c08 R_ARM_ABS32 .rodata

and in gcx.o:

Dump of assembler code for function _D2gc3gcx3Gcx7findBinFkZh:
<_D2gc3gcx3Gcx7findBinFkZh+20>
0x00000c00 <+12>: ldrbls r0, [r3, r0]
0x00000c04 <+16>: bx lr
0x00000c08 <+20>: andeq r0, r0, r0, rrx
End of assembler dump.

(gdb) x/x 0x00000c08
0xc08 <_D2gc3gcx3Gcx7findBinFkZh+20>: 0x00000060

binTable is at offset 0x000000e0 --> 224
The address in the literal pool is 0x00000060 --> 96

So relocation doesn't change anything, the address is already off by 128 before
relocation.

Iain do you have a clue what could cause the wrong address? If not, I'll try to
write a script for dustmite and let dustmite reduce that thing.

Johannes Pfau - 2011-10-04
******************************************
* assigned issue to Iain Buclaw

OK, I think I tracked this thing down, but I'd be very grateful if someone else
could write the gdc patch ;-)

This is the problematic declaration in gcx.d:

immutable uint binsize[B_MAX] = [ 16,32,64,128,256,512,1024,2048,4096 ];

What's special about this declaration? It's incomplete: B_MAX is 12, but the
array literal only has 9 values. AFAICS gdc currently only passes a constructor
with the 9 values to the backend and assumes that gcc adds the additional
zeros. Long story short: it doesn't (correctly) and the zero padding should be
added in the frontend.

But what exactly happens in the backend?
This code is from varasm.c function "output_constructor_regular_field":

fieldpos = (tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (local->val)), 1)
            * ((tree_low_cst (local->index, 0)
                - tree_low_cst (local->min_index, 0))));

it calculates field positions for array members.
tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (local->val)), 1) is the size of the
elements, ((tree_low_cst (local->index, 0) is the index in the array.

This works fine for the first 9 entries in our example. But, for some reason,
the backend (or maybe the frontend, I'm not sure where the padding is added)
added the 12 bytes final padding as one tree/element. This means,
tree_low_cst (TYPE_SIZE_UNIT (TREE_TYPE (local->val)), 1) is 12 in this case
and the calculation calculates 12*9=108 instead of 4*9=36. GCC detects that
there's a gap between the last and the current offset and fills it with 72
zeros.

AFAICS this should affect all architectures. What makes it catastrophic on ARM
are section anchors: Section anchors refer to the arrays relative to the
.rodata base address. But the section anchor code uses a different code path to
calculate the offset, and ignores the zero padding added earlier. As a result,
data is loaded from wrong offsets. (Although, technically, the offsets are
correct, the .rodata is actually wrong)

What needs to be done in the frontend then is this: For incomplete arrays, add
the zero padding manually, so that the constructor is complete. The padding
must be added as single elements of the array element size. It has to pass the
same tree to the backend as

[ 16,32,64,128,256,512,1024,2048,4096, 0, 0, 0 ]

would.

Iain Buclaw - 2011-10-05
******************************************
* attached issue120.patch

Patch should fix the issue. Can you test?

Regards

Johannes Pfau - 2011-10-05
******************************************
Yep, I can confirm this patch is working.
The .rodata segment is fine now and the crash is gone :-)

However, there still is a problem with section anchors, we may have missed a
case somewhere:

...
_moduleinfo_array: 71:0x8ea0c
_moduleinfo_array: 72:0x8e79c
_moduleinfo_array: 73:0x4 <---------------------------------
_moduleinfo_array: 74:0x8e530

That moduleinfo belongs to core.runtime. If core-runtime is compiled with
-fno-section-anchors, a hello world works. It's caused by incorrect relocations
again. I think the issue is caused by this symbol:

_D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo7__ClassZ

The length of that entry in the .data section is set to 76, but for some reason
108 bytes are output for this symbol.

_D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo7__ClassZ:
    .word _D14TypeInfo_Class6__vtblZ
    .word 0
    .word 4120
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo6__initZ
    .word 49
    .word .LC1
    .word 8
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo6__vtblZ
    .word 1
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo7__ClassZ+76
    .word _D6Object7__ClassZ
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo6__dtorMFZv
    .word 0
    .word 60
    .word 0
    .word 0
    .word 0
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo6__ctorMFZC4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo
    .word 0
    .word _D6object9Throwable9TraceInfo11__InterfaceZ
    .word 4
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo7__ClassZ+92
    .word 4116
    .word _D4core7runtime19defaultTraceHandlerFPvZC6object9Throwable9TraceInfo16DefaultTraceInfo7__ClassZ+76
    .word ___t.0.1431
    .word ___t.1.1432
    .word ___t.2.1433

It doesn't happen for other ClassZ though. Any idea what could cause this?
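
For reference, the field-position arithmetic from the 2011-10-04 analysis of binsize (12*9=108 instead of 4*9=36, with 72 zero bytes filling the gap) can be checked with a small C sketch. The two functions are hypothetical helpers that only mirror the arithmetic described in that comment; they are not GCC code.

```c
/* Same shape as the quoted varasm.c expression:
   element size in bytes times (index - min_index). */
long field_position(long element_size, long index, long min_index)
{
    return element_size * (index - min_index);
}

/* Number of zero bytes needed to advance from the bytes already written
   to the computed field position (0 if there is no gap). */
long gap_bytes(long bytes_written, long fieldpos)
{
    return fieldpos > bytes_written ? fieldpos - bytes_written : 0;
}
```

With nine 4-byte elements already written (36 bytes), field_position(4, 9, 0) is 36 and leaves no gap, whereas field_position(12, 9, 0) is 108 and gap_bytes(36, 108) is 72, the zero fill mentioned in that analysis.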
Feb 01 2014