digitalmars.D.learn - dmd asm output

reply "John Colvin" <john.loughran.colvin gmail.com> writes:
I've been learning assembler a bit and I decided to have a look 
at what dmd spits out. I tried a simple function with arrays to 
see what vectorization gets done:

void addto(int[] a, int[] b) {
     a[] += b[];
}

dmd -O -release -inline -noboundscheck -gc -c test.d

disassembled with gdb:
_D3sse5addtoFAiAiZv:
0x0000000000000040 <+0>:      push   rbp
0x0000000000000041 <+1>:      mov    rbp,rsp
0x0000000000000044 <+4>:      sub    rsp,0x30
0x0000000000000048 <+8>:      mov    QWORD PTR [rbp-0x20],rdi
0x000000000000004c <+12>:    mov    QWORD PTR [rbp-0x18],rsi
0x0000000000000050 <+16>:    mov    QWORD PTR [rbp-0x10],rdx
0x0000000000000054 <+20>:    mov    QWORD PTR [rbp-0x8],rcx
0x0000000000000058 <+24>:    mov    rcx,QWORD PTR [rbp-0x18]
0x000000000000005c <+28>:    mov    rax,QWORD PTR [rbp-0x20]
0x0000000000000060 <+32>:    mov    rdx,rax
0x0000000000000063 <+35>:    mov    QWORD PTR [rbp-0x28],rdx
0x0000000000000067 <+39>:    mov    rdx,QWORD PTR [rbp-0x8]
0x000000000000006b <+43>:    mov    rdi,QWORD PTR [rbp-0x10]
0x000000000000006f <+47>:     mov    rsi,rdx
0x0000000000000072 <+50>:    mov    rdx,QWORD PTR [rbp-0x28]
0x0000000000000076 <+54>:    call   0x7b <_D3sse5addtoFAiAiZv+59>
0x000000000000007b <+59>:    mov    rsp,rbp
0x000000000000007e <+62>:    pop    rbp
0x000000000000007f <+63>:     ret

This looks nothing like what I expected. At first I thought maybe 
it was due to a crazy calling convention, but adding extern(C) 
changed nothing.

Can anyone explain what on earth is going on here? All that 
moving things on and off the stack, a call to the next line 
(strange) and then we're done bar the cleanup?  I feel I must be 
missing something.
Mar 31 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
John Colvin:

 Can anyone explain what on earth is going on here?
In the dmd sources there are the sources for those array operations too.
In what you are seeing I think something is not recognizing the SSE+
instructions.

Bye,
bearophile
Mar 31 2013
parent reply "bearophile" <bearophileHUGS lycos.com> writes:
 In what you are seeing I think something is not recognizing the 
 SSE+ instructions.
Sorry, I was wrong. The SSE ops are done elsewhere. You see that
"call 0x7b <_D3sse5addtoFAiAiZv+59>".

Bye,
bearophile
Mar 31 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 1 April 2013 at 02:03:12 UTC, bearophile wrote:
 In what you are seeing I think something is not recognizing 
 the SSE+ instructions.
 Sorry, I was wrong. The SSE ops are done elsewhere. You see that
 "call 0x7b <_D3sse5addtoFAiAiZv+59>".
 Bye, bearophile
Whoops, sorry, the actual filename I used was sse.d. You can see that in
the function name at the top, the same name as in the call.
Mar 31 2013
prev sibling next sibling parent "nazriel" <spam dzfl.pl> writes:
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
 Can anyone explain what on earth is going on here? All that 
 moving things on and off the stack, a call to the next line 
 (strange) and then we're done bar the cleanup?  I feel i must 
 be missing something.
It just looks like the wrong snippet. GDB probably isn't the best
assembly-level debugger.

.text._D4test5addtoFAiAiZAi:08000044                 public _D4test5addtoFAiAiZAi
.text._D4test5addtoFAiAiZAi:08000044 _D4test5addtoFAiAiZAi proc near
.text._D4test5addtoFAiAiZAi:08000044
.text._D4test5addtoFAiAiZAi:08000044 arg_0           = dword ptr  8
.text._D4test5addtoFAiAiZAi:08000044 arg_8           = dword ptr  10h
.text._D4test5addtoFAiAiZAi:08000044 arg_C           = dword ptr  14h
.text._D4test5addtoFAiAiZAi:08000044
.text._D4test5addtoFAiAiZAi:08000044                 push    ebp
.text._D4test5addtoFAiAiZAi:08000045                 mov     ebp, esp
.text._D4test5addtoFAiAiZAi:08000047                 push    dword ptr [esp+0Ch]
.text._D4test5addtoFAiAiZAi:0800004B                 push    [ebp+arg_0]
.text._D4test5addtoFAiAiZAi:0800004E                 push    [ebp+arg_C]
.text._D4test5addtoFAiAiZAi:08000051                 push    [ebp+arg_8]
.text._D4test5addtoFAiAiZAi:08000054                 call    _arraySliceSliceAddass_i
.text._D4test5addtoFAiAiZAi:08000059                 add     esp, 10h
.text._D4test5addtoFAiAiZAi:0800005C                 pop     ebp
.text._D4test5addtoFAiAiZAi:0800005D                 retn    10h
.text._D4test5addtoFAiAiZAi:0800005D _D4test5addtoFAiAiZAi endp

Pardon the 32 bits; my IDA Free doesn't handle 64-bit too well.
The only difference is that the arguments here are passed on the stack
instead of in rdi, rsi, etc., as happens under the System V AMD64
calling convention.
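
For anyone wondering what that call is expected to do: the listing shows the compiler handing a[] += b[] to a runtime helper (_arraySliceSliceAddass_i) instead of emitting a loop in place. Below is a minimal sketch in D of the element-wise semantics, assuming equal-length slices. The name addtoLoop is hypothetical and this is not the runtime's implementation, which may well use SSE.

```d
import std.stdio;

// Hypothetical stand-in for the runtime helper: a plain element-wise add.
// The real routine is free to use SIMD; this only shows the semantics.
void addtoLoop(int[] a, const(int)[] b)
{
    assert(a.length == b.length, "slices must have equal length");
    foreach (i, ref x; a)
        x += b[i];
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = [10, 20, 30];
    int[] c = a.dup;

    a[] += b[];        // array operation, as in the original post
    addtoLoop(c, b);   // explicit loop over a copy of the same data

    assert(a == c);    // both forms give the same result
    writeln(a);        // prints [11, 22, 33]
}
```

Since the work happens inside the helper, any vectorization would have to be looked for there and not in addto itself.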
Apr 01 2013
prev sibling parent reply "js.mdnq" <js_adddot+mdng gmail.com> writes:
On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
 Can anyone explain what on earth is going on here? All that 
 moving things on and off the stack, a call to the next line 
 (strange) and then we're done bar the cleanup?  I feel i must 
 be missing something.
What's after the code?

The 0x76 call is an inline call function, the ret returns it. The stuff
before it is setting up the registers for the call and what comes after

 0x0000000000000076 <+54>:    call   0x7b <_D3sse5addtoFAiAiZv+59>
 0x000000000000007b <+59>:    mov    rsp,rbp
 0x000000000000007e <+62>:    pop    rbp
 0x000000000000007f <+63>:    ret
As you can see, the call is calling the function right below it, but
where it returns to depends on what is on the stack (since ip is being
popped into rbp).

To me, and this is a guess, this looks like some type of table of
functions being called (the ret is being redirected to somewhere other
than the place it was called from). So there is much more going on than
meets the eye.

It would be easier to understand if you stepped through the code to see
where the ret is headed.
Apr 01 2013
parent reply Artur Skawina <art.08.09 gmail.com> writes:
On 04/01/13 12:24, js.mdnq wrote:
 On Monday, 1 April 2013 at 01:54:10 UTC, John Colvin wrote:
 What's after the code?
 
 The 0x76 call is an inline call function, the ret returns it. The stuff
 before it is setting up the registers for the call and what comes after
 
 0x0000000000000076 <+54>:    call   0x7b <_D3sse5addtoFAiAiZv+59>
 0x000000000000007b <+59>:    mov    rsp,rbp
 0x000000000000007e <+62>:    pop    rbp
 0x000000000000007f <+63>:    ret
 As you can see, the call is calling the function right below it, [...]

This is just how objdump/gdb shows the code - it does *not* display
relocations inline, so you get this misleading output. The call
instruction will not end up having a zero offset (that is why it seems
to point at the next op), but will be fixed up to call the right
function.

Run

   objdump -dr your_obj_or_exe_file

and the real call target will be shown as a relocation entry after the
call instruction.

artur
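
The arithmetic behind the misleading "call 0x7b" can be sketched in D. This assumes the call is the usual 5-byte near call (opcode E8 followed by a signed 32-bit displacement), which matches the 0x76 to 0x7b spacing in the dump. The helper name callTarget is made up for illustration.

```d
import std.stdio;

// Hypothetical helper: where does an x86 `call rel32` (opcode E8) go?
// The target is the address of the *next* instruction plus the signed
// 32-bit displacement stored in the instruction.
ulong callTarget(ulong callAddr, int rel32)
{
    enum ulong callLength = 5; // 1 opcode byte + 4 displacement bytes
    return callAddr + callLength + rel32;
}

void main()
{
    // In the unlinked object file the displacement field is still zero,
    // so a disassembler that ignores relocations prints the address of
    // the following instruction as the target.
    writefln("0x%x", callTarget(0x76, 0)); // prints 0x7b
}
```

Once the linker applies the relocation, the displacement becomes nonzero and the same formula yields the address of the real callee.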
Apr 01 2013
parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Monday, 1 April 2013 at 11:10:56 UTC, Artur Skawina wrote:
 This is just how objdump/gdb shows the code - it does *not* display
 relocations inline, so you get this misleading output. [...] Run
 objdump -dr your_obj_or_exe_file and the real call target will be shown
 as a relocation entry after the call instruction.
Thanks, that explains it.
Apr 01 2013