digitalmars.D.learn - assembler copy char[]

nobody <not possible.net> writes:
Hello,

I am trying to copy mystring to st in assembler.
Can someone give me advice on how to do this?

import std.stdio;
void main()
{
        char[] mystring = "Hey Assembler\n";
        char[] st;

        asm
        {
                mov EAX, dword ptr [mystring+4];
                mov st,EAX;
        }
        writefln("st: ", st);
}

I get a segmentation fault.

Thanks
Jun 07 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
I have to wonder why on earth you're using assembly, and how you arrived at the above code. From what I can tell, you're copying mystring's *pointer* into st's length field, and then trying to print it.

You say you want to "copy" the string, but you could do it just as easily like this:

        st = mystring;

Why exactly are you doing this? I ask because I'm somewhat hesitant to hand a bazooka to someone who seems to be having trouble working out which end the rocket comes out of...

-- Daniel
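
For reference, a minimal sketch of what `st = mystring` does and does not copy. It assumes a current D2 compiler with Phobos rather than the 2007-era compiler used in this thread, and the helper name `sharesData` is made up for the illustration.

```d
import std.stdio;

// Hypothetical helper: true when both slices point at the same first element.
bool sharesData(const(char)[] a, const(char)[] b)
{
    return a.ptr is b.ptr;
}

void main()
{
    char[] mystring = "Hey Assembler".dup; // mutable copy of the literal
    char[] st = mystring;                  // copies only length and pointer
    char[] deep = mystring.dup;            // allocates and copies the characters

    assert(sharesData(st, mystring));
    assert(!sharesData(deep, mystring));

    mystring[0] = 'h';       // visible through st, not through deep
    assert(st[0] == 'h');
    assert(deep[0] == 'H');

    writeln("st:   ", st);
    writeln("deep: ", deep);
}
```

Plain assignment therefore copies the array reference only; `.dup` is what duplicates the characters themselves.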
Jun 07 2007
Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
nobody wrote:
 I get a Segmentation fault .

Unsurprising, since you're copying mystring.ptr to st.length, creating an array reference to a (likely huge) array with (.ptr == null) ;).

Like Daniel, who posted while I was composing this post, I wonder why you're trying to do this. There's a much easier way that can likely be better optimized by the compiler. But if you're determined to do this (or are just trying to expand your knowledge of asm coding and/or D array implementation), read on.

Assuming you're trying to code "st = mystring" in assembler, try this:
---
import std.stdio;
void main()
{
        char[] mystring = "Hey Assembler\n";
        char[] st;

        asm
        {
                // Copy length
                mov EAX, dword ptr [mystring];
                mov [st],EAX;

                // Copy ptr
                mov EAX, dword ptr [mystring+4];
                mov [st+4],EAX;
        }
        writefln("st: ", st);
}
---
Remember, dynamic arrays have two parts: a length and a pointer to the data. You need to copy both of them (and to the corresponding part of the destination, obviously) to copy a dynamic array reference.

P.S. It'll likely be a bit more efficient to do this:
---
        asm
        {
                mov ECX, dword ptr [mystring];
                mov EDX, dword ptr [mystring+4];
                mov [st],ECX;
                mov [st+4],EDX;
        }
---
because it leaves more time between reading and writing, allowing the CPU to perform better pipelining.

About register usage: I used ECX & EDX here because (like EAX) those don't need to be preserved between function calls, so the compiler doesn't necessarily need to insert extra code to preserve them. I didn't use EAX because that's more likely to contain a useful value due to its special uses in calling conventions, and is thus more likely to require extra code to be emitted to preserve it.

But better optimization would likely result from just changing it to "st = mystring" and adding '-O' (or '-O3' for GDC) to the command line options passed to the compiler :). (Plus it'll be platform-independent)
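
The two-part layout can also be inspected from plain D, without any assembler. The following is only a sketch: it assumes a current D2 compiler and the layout documented in the D ABI (length first, then pointer), and `RawSlice` and `rawView` are names invented for this example.

```d
import std.stdio;

// Hypothetical mirror of a dynamic array reference: length first, then pointer.
struct RawSlice
{
    size_t length;
    const(char)* ptr;
}

// Reinterpret the bytes of the slice variable itself as the two-field struct.
RawSlice rawView(const(char)[] s)
{
    return *cast(RawSlice*)&s;
}

void main()
{
    string mystring = "Hey Assembler";
    auto raw = rawView(mystring);

    // A slice is exactly two machine words wide.
    static assert((char[]).sizeof == 2 * size_t.sizeof);

    assert(raw.length == mystring.length);
    assert(raw.ptr is mystring.ptr);

    writeln("length at offset 0, ptr at offset ", size_t.sizeof);
}
```

On the 32-bit x86 targets discussed here `size_t.sizeof` is 4, which is where the `+4` in the asm blocks comes from.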
Jun 07 2007
nobody <not possible.net> writes:
Frits van Bommel Wrote:

 But if you're determined to do this (or are just trying to expand your knowledge of asm coding and/or D array implementation), read on.

Yes, I am trying to learn inline assembler, but it is more difficult than I thought ;-)
 
 Remember, dynamic arrays have two parts: a length and a pointer to the 
 data.

Yes, I figured this out.

 You need to copy both of them (and to the corresponding part of
 the destination, obviously) to copy a dynamic array reference.

Ah, that's the trick.
 
 
 

Wow, that's exactly what I want.
In DMD it's all OK, but GDC didn't like this:

mov EDX, dword ptr [mystring+4];
mov [st+4],EDX;

gdmd string.d
/tmp/cc4h5AKo.s: Assembler messages:
/tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
/tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression
Jun 07 2007
Don Clugston <dac nospam.com.au> writes:
nobody wrote:
 Wow, that's exactly what i want.
 In dmd it's all ok, but the gdc didn't like it:

 mov EDX, dword ptr [mystring+4];
 mov [st+4],EDX;

 gdmd string.d
 /tmp/cc4h5AKo.s: Assembler messages:
 /tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
 /tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression

uppercase; maybe not true for GDC.
Jun 07 2007
Daniel Keep <daniel.keep.lists gmail.com> writes:
nobody wrote:
 Wow, that's exactly what i want.
 In dmd it's all ok, but the gdc  didn't like it: 
 
 mov EDX, dword ptr [mystring+4];
 mov [st+4],EDX;
 
 
 gdmd  string.d
 /tmp/cc4h5AKo.s: Assembler messages:
 /tmp/cc4h5AKo.s:30: Error: junk `(%ebp)+4' after expression
 /tmp/cc4h5AKo.s:31: Error: junk `(%ebp)+4' after expression

IIRC, that's because local variables are actually EBP plus some offset. I think DMD is inlining the two offsets, but GDC isn't.

If you want to learn assembler, the best way is to just write code in D, compile it, and then disassemble it. I assume you're running under Linux; gdb should have an option to disassemble the current function. That way, you can read the original line of source code, and what the compiler actually produces.

Here's what ddbg gives me for the program

void main()
{
    auto mystring = "Hello, World!";
    auto st = mystring;
}

Disassembly:

copy_string.d:2 void main()
00402010: c8200000      enter 0x20, 0x0
00402014: 53            push ebx
copy_string.d:4     auto mystring = "Hello, World!";
00402015: 8d45e0        lea eax, [ebp-0x20]
00402018: 50            push eax
00402019: 6a0d          push 0xd
0040201b: ff3594f04000  push dword [0x40f094]
00402021: ff3590f04000  push dword [0x40f090]
00402027: 6a01          push 0x1
00402029: e87e010000    call 0x4021ac __d_arraycopy
copy_string.d:5     auto st = mystring;
0040202e: 8d4df0        lea ecx, [ebp-0x10]
00402031: 51            push ecx
00402032: 6a0d          push 0xd
00402034: 8d55e0        lea edx, [ebp-0x20]
00402037: bb0d000000    mov ebx, 0xd
0040203c: 52            push edx
0040203d: 53            push ebx
0040203e: 6a01          push 0x1
00402040: e867010000    call 0x4021ac __d_arraycopy
00402045: 31c0          xor eax, eax
00402047: 83c428        add esp, 0x28
copy_string.obj
0040204a: 5b            pop ebx
0040204b: c9            leave
0040204c: c3            ret

-- Daniel
Jun 07 2007
nobody <not possible.net> writes:
With gdb:

Dump of assembler code for function main:
0x08049a70 <main+0>:    lea    0x4(%esp),%ecx
0x08049a74 <main+4>:    and    $0xfffffff0,%esp
0x08049a77 <main+7>:    pushl  0xfffffffc(%ecx)
0x08049a7a <main+10>:   push   %ebp
0x08049a7b <main+11>:   mov    %esp,%ebp
0x08049a7d <main+13>:   push   %ecx
0x08049a7e <main+14>:   sub    $0x14,%esp
0x08049a81 <main+17>:   mov    (%ecx),%edx
0x08049a83 <main+19>:   mov    0x4(%ecx),%eax
0x08049a86 <main+22>:   mov    $0x8049154,%ecx
0x08049a8b <main+27>:   mov    %ecx,0x8(%esp)
0x08049a8f <main+31>:   mov    %edx,(%esp)
0x08049a92 <main+34>:   mov    %eax,0x4(%esp)
0x08049a96 <main+38>:   call   0x8049af0 <_d_run_main>
0x08049a9b <main+43>:   add    $0x14,%esp
0x08049a9e <main+46>:   pop    %ecx
0x08049a9f <main+47>:   pop    %ebp

OK, I will try again tomorrow, with ESP. I had thought that was only needed in naked asm blocks?
Jun 07 2007
nobody <not possible.net> writes:
Here are my new versions.

With ESP:

import tango.io.Stdout;
void main()
{
        char[] mystring = "Hey Assembler";
        char[] str;

        asm
        {
                // Load address of the destination
                lea ECX,dword ptr [str] ;

                // Copy length
                // (assumes mystring sits at [ESP]; that depends on the
                // compiler's stack frame layout)
                mov EDX, dword ptr [ESP] ;
                mov [ECX],EDX ;

                // Copy ptr
                mov EDX, dword ptr [ESP+4] ;
                mov [ECX+4],EDX ;
        }
        Stdout("str ")(str).newline;
}

With lea:

import tango.io.Stdout;
void main()
{
        char[] mystring = "Hey Assembler";
        char[] str;

        asm
        {
                // Load addresses of destination and source
                lea ECX,dword ptr [str] ;
                lea EBX,dword ptr [mystring] ;

                // Copy length
                mov EDX, dword ptr [EBX] ;
                mov [ECX],EDX ;

                // Copy ptr
                mov EDX, dword ptr [EBX+4] ;
                mov [ECX+4],EDX ;
        }
        Stdout("str ")(str).newline;
}
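
The lea version can be mirrored in plain D, which may help when checking what the asm is supposed to do. This is a hedged sketch only: it assumes a current D2 compiler with Phobos (not the Tango setup above) and the length-then-pointer layout from the D ABI, and `copySliceWords` is a name invented for the example.

```d
import std.stdio;

// Hypothetical helper: copy a slice reference word by word,
// the way the lea-based asm block does.
void copySliceWords(ref char[] dst, ref char[] src)
{
    auto d = cast(size_t*)&dst; // address of the destination reference
    auto s = cast(size_t*)&src; // address of the source reference
    d[0] = s[0];                // length word
    d[1] = s[1];                // pointer word
}

void main()
{
    char[] mystring = "Hey Assembler".dup;
    char[] str;

    copySliceWords(str, mystring);

    // 'is' on dynamic arrays compares both pointer and length.
    assert(str is mystring);
    writeln("str ", str);
}
```

In ordinary code `str = mystring;` does the same thing and leaves the details to the compiler.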
Jun 08 2007
Thomas Kuehne <thomas-dloop kuehne.cn> writes:
nobody wrote on 2007-06-07:
 Yes, I am trying to learn inline assembler, but it is more difficult than I thought ;-)

If you'd like to learn inline assembler the hard way, have a look at the asm_*d files at http://dstress.kuehne.cn/run/a . Some of those require the helper http://dstress.kuehne.cn/addon/cpuinfo.d . Please don't read cpuinfo.d because it uses various hacks to get around some GDC bugs.

Thomas
Jun 07 2007