c++.dos.32-bits - X32 bug???
- "Laurentiu Pancescu" <lpancescu fastmail.fm> Jan 31 2002
- "Walter" <walter digitalmars.com> Jan 31 2002
- "Laurentiu Pancescu" <plaur crosswinds.net> Jan 31 2002
- Jan Knepper <jan smartsoft.cc> Feb 01 2002
- "Laurentiu Pancescu" <plaur crosswinds.net> Feb 01 2002
- "Walter" <walter digitalmars.com> Feb 01 2002
- "Laurentiu Pancescu" <lpancescu fastmail.fm> Feb 02 2002
- "Walter" <walter digitalmars.com> Feb 02 2002
- "Walter" <walter digitalmars.com> Feb 02 2002
- "Laurentiu Pancescu" <lpancescu fastmail.fm> Feb 03 2002
- "Walter" <walter digitalmars.com> Feb 03 2002
- "Laurentiu Pancescu" <lpancescu fastmail.fm> Feb 04 2002
- "Walter" <walter digitalmars.com> Feb 04 2002
- Heinz Saathoff <hsaat bre.ipnet.de> Feb 05 2002
- "Laurentiu Pancescu" <lpancescu fastmail.fm> Feb 05 2002
I'm using NASM to assemble an external function, then I link the OBJ file
with the rest of the DMC-compiled files, but the EXE crashes (GPF, 0DH) at
run time. Only -mx is affected by this; -mn works fine, and so do tests made
with Borland C++ 5.5.1, Cygwin, MinGW and DJGPP (same assembly code; NASM can
generate all the required object formats).
Here's an example:
; test.asm
; use "nasm -f obj test.asm -o test.obj"
segment code public use32
global _get_value
_get_value:
        push    ebp
        mov     ebp, esp
        mov     eax, [ebp + 8]
        add     eax, eax
        leave
        retn
/* main.c */
/* sc -mx main.c test.obj x32.lib */
#include <stdio.h>

unsigned get_value(unsigned);

int main(void)
{
    printf("Result is %u\n", get_value(9));
    return 0;
}
On all other configurations, the displayed value is 18, as expected. I
looked in the DMC generated code, and everything is okay (duh!). Maybe this
is a problem with the DOS extender? Did anyone else encounter this problem?
If you don't use NASM, I think my asm example should be straightforward to
convert to TASM or MASM syntax.
My environment is Win2k SP2, DMC 8.26, X32 from May 15th (latest version,
you know what I'm talking about). Any feedback would be appreciated -
thanks!
Is there a debugger for DOSX programs? WUDEBUG only debugs WDOSX
programs... :(
Laurentiu
Jan 31 2002
Try a simple hello world program with -mx and verify that it works on your system.
-Walter
Jan 31 2002
"Walter" <walter digitalmars.com> wrote in message news:a3csdi$4ac$1 digitaldaemon.com...
> Try a simple hello world program with -mx and verify that works on your system.
> -Walter
It does, and so do pretty large programs, both in C and C++. There's no problem when everything is high-level source code. Problems arise when I try to use externally defined functions. (I use NASM for portability reasons, and because of its cleaner syntax: I can use the same ASM file for any 32-bit compiler and virtually any operating system! It's very suitable for portable MMX or 3DNow! optimizations.)
It works with DMC in Win32 mode, so I think the problem is in X32, not in my code. I "heard" about some stack alignment problems when X32 runs under real-mode DOS: maybe they're not the only ones? Do you think I should contact Mr. Doug Huffman about this issue?
Laurentiu
Jan 31 2002
Check http://www.dosextender.com/
I think Doug Huffman put out a new version...
Feb 01 2002
I was using the latest version when I saw those problems... I downloaded it 2 days ago, but the result is the same! It may be related to NTVDM bugs, I don't know... I'll try booting from a DOS disk to see if it still crashes.
Laurentiu
"Jan Knepper" <jan smartsoft.cc> wrote in message news:3C5AAA54.88A9D7CA smartsoft.cc...
> Check http://www.dosextender.com/
> I think Doug Huffman put out a new version...
Feb 01 2002
I use assembler files with X32 all the time. You can view them at \dm\src\core32\*.asm and \dm\src\dos32\*.asm.
Can I suggest taking your asm file and assembling it with nasm. Then try it again using dmc's inline assembler. Obj2asm the results and compare!
Feb 01 2002
"Walter" <walter digitalmars.com> wrote in message news:a3fcef$rir$1 digitaldaemon.com...
> Can I suggest taking your asm file and assembling it with nasm. Try it again
> using dmc's inline assembler. Obj2asm the results and compare!
I ran obj2asm on the object generated by NASM and saw only db lines there instead of actual assembly code. So I added 'class=CODE' to the segment declaration, and it's fine now. X32 probably raised the GPF when calling code that had ended up in a DATA segment.
I don't understand why this is okay with all the Windows compilers, including DMC, and also with DJGPP (which also uses a DOS extender). Perhaps it's related to how the linker and the OS loader work?
Thanks,
Laurentiu
Feb 02 2002
Glad you found what was going wrong. The reason you got the crash is that X32 marks the code segment as execute-only and the data segment as not executable. Other DOS extenders apparently don't do that.
Feb 02 2002
Your solution is now in the FAQ! Thanks,
-Walter
Feb 02 2002
"Walter" <walter digitalmars.com> wrote in message news:a3hkt8$232q$2 digitaldaemon.com...
> Your solution is now in the FAQ! Thanks,
> -Walter
Great, thanks! I'm also glad because my MMX code now works fine with DMC. However, I notice that my loop runs about 20% slower than in the Borland or gcc builds (no external calls, only MOVQ, PXOR and POR!). I expected the speed to be the same for any compiler, since they don't touch the assembly code. Then I tried to force my assembly function to be aligned on a paragraph boundary, but this only made things worse by an additional 10%. I guess OPTLINK knows better about alignments... :)
Is it possible that the way different runtime libraries initialize the FPU affects MMX performance (since MMX and FPU instructions use the same physical registers)? There's also a slight difference between the Borland- and gcc-generated EXEs, about 2-3%, and I don't see another reason for it.
Laurentiu
Feb 03 2002
"Laurentiu Pancescu" <lpancescu fastmail.fm> wrote in message news:a3jgvm$2u58$2 digitaldaemon.com...
> However, I notice that the performance of my loop is about 20% weaker
> than in the Borland or gcc cases (no external calls, only MOVQ, PXOR and
> POR!). I expect this to be the same for any compiler, since they don't
> touch it. Then, I tried to force an alignment to a paragraph border for my
> assembly function, but this only made things worse by an additional 10% - I
> guess OPTLINK knows better about alignments... :)
Alignment probably is the issue. Try putting in NOPs one at a time before your loop, and time each time.
> Is it possible that the way different runtime libraries initialize the FPU
> affects the MMX performance (since both MMX and FPU instructions use the
> same physical registers)??? There's also a slight difference between Borland
> and gcc generated EXEs, about 2-3% - I don't see another reason.
I can't imagine how that would affect things. If it does, please let me know!
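Timing each NOP variant, as suggested above, needs a repeatable measurement. The following is only a rough sketch in standard C, not code from this thread: time_call and sample_workload are hypothetical names, and sample_workload merely stands in for the real assembly routine. Because clock() can be coarse, especially under DOS, the routine is run many times and the result averaged.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for the routine being measured
   (in the thread this would be the external MMX loop). */
void sample_workload(void)
{
    volatile unsigned long sum = 0;
    unsigned long i;

    for (i = 0; i < 100000UL; ++i)
        sum += i;
}

/* Calls fn() reps times and returns the average CPU time
   per call in seconds, as reported by clock(). */
double time_call(void (*fn)(void), int reps)
{
    clock_t start, stop;
    int i;

    assert(fn != 0 && reps > 0);

    start = clock();
    for (i = 0; i < reps; ++i)
        fn();
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC / reps;
}
```

Calling time_call(sample_workload, 1000) once per build (one build per NOP count) gives numbers that can be compared directly, provided the repetition count stays the same across builds.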
Feb 03 2002
"Walter" <walter digitalmars.com> wrote in message news:a3knen$dig$1 digitaldaemon.com...
> > I notice that the performance of my loop is about 20% weaker
> > than in the Borland or gcc cases (no external calls, only MOVQ, PXOR and
> > POR).
> Alignment probably is the issue. Try putting in NOPs one at a time before
> your loop, and time each time.
I did some testing, with very interesting results: when I specified -o+space for compiling the C source files, the C code performance dropped slightly, but the MMX loop performance matched that of the EXEs generated by BCC or gcc (it was even slightly better). I'm really confused about this, since NASM handles my MMX loop the same way each time, and I called OPTLINK directly, so that it doesn't know about any requirement to optimize for space (just in case it cares about SC's -o+space). What's more, I had got used to the corresponding DOSX program, compiled from the same source, running about 5-10% slower than its Win32 counterpart, but now, with -o+space, it runs faster!!!
I also did another test, using a source with a simple C loop that I saw on one of BCC's newsgroups some months ago:
- -o, -o+speed, -o+all: execution time is 13 seconds
- no optimization flags specified: execution time is 4 seconds
- -o+space: execution time is 3 seconds
I thought -o+all was *always* the best to use, but that proves not to be the case... I can send you the sources for those two tests, if you want; perhaps they could help improve the optimizer?
Laurentiu
Feb 04 2002
Since you said the critical loop is in the assembler code, it cannot be the optimizer. The optimizer does not affect the assembler. I bet it's alignment. Try the NOP suggestion.
-Walter
Feb 04 2002
Walter wrote...
> Since you said the critical loop is in the assembler code, it cannot be
> the optimizer. The optimizer does not affect the assembler. I bet it's
> alignment. Try the NOP suggestion.
> -Walter
Right. The code might fit into the processor cache in one case and not in the other, depending on the starting address of the critical code. As a result of optimization, the assembly part can move to a base address that is not optimal for caching.
Just a guess,
Heinz
Feb 05 2002
"Walter" <walter digitalmars.com> wrote in message news:a3nvlq$2f0q$2 digitaldaemon.com...
> I bet it's alignment. Try the NOP suggestion.
> -Walter
You'd win the bet... almost! It was indeed an alignment problem, though not of the code but of the data that the MMX instructions access. Playing with NOPs only improved performance by 2%, which is not significant compared to the improvement from 2.5 seconds to 1.8 (execution time).
One of the operands of my instructions cannot be aligned, but the other one could. I used an automatic array (char p[48]), declared in main(), and passed a pointer to it. With "-o+all", p ends up aligned on a 4-byte boundary, while "-o+space" puts p on an 8-byte boundary, which is vital for MMX performance. Both BCC and GCC align automatic arrays on 8 or 16 bytes by default, so this is where the performance penalty came from!
I did more tests related to alignment in code generated by DMC and other compilers, but I will post a separate message in c++, since we're already pretty far away from the original crash of NASM-generated code... :)
Many thanks for your help and suggestions!
Laurentiu
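To make the data-alignment point concrete, here is a minimal sketch in C of one way to check a buffer's alignment and to get an 8-byte-aligned pointer regardless of where the compiler places an automatic array. It assumes a C99 <stdint.h> for uintptr_t, which the DMC release discussed in this thread may not provide; is_aligned and align_up are hypothetical helper names, not part of any compiler's library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p lies on a multiple of boundary, else 0.
   boundary must be a power of two (e.g. 8 for MMX quadwords). */
int is_aligned(const void *p, size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(boundary - 1)) == 0;
}

/* Rounds p up to the next multiple of boundary (a power of two).
   The caller must have over-allocated by boundary - 1 bytes. */
void *align_up(void *p, size_t boundary)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask;

    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    mask = (uintptr_t)(boundary - 1);
    return (void *)((addr + mask) & ~mask);
}
```

For the 48-byte buffer described above, declaring char raw[48 + 7] and passing align_up(raw, 8) to the assembly routine would give 48 usable bytes on an 8-byte boundary whichever optimization switch is used, at the cost of up to 7 wasted bytes.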
Feb 05 2002