
c++.windows.32-bits - console apps slowdown

reply "Laurentiu Pancescu" <user domain.invalid> writes:
Hello!

I've written a floating-point test as a console application.  It displays
nothing during the calculations, only the result and the elapsed time once it
completes.  The big surprise is that when I run it in a WinME console window,
it's 50% slower than when I run the same EXE under "rxvt for Win32" from the
MSYS package (www.mingw.org).  This happens with any console-mode Win32
application, not only the ones compiled with DMC.  Now the weird part:
DOS-extended applications run at the same speed under both shells, with no
additional slowdown in the Win32 console.  I've also tried with Cygwin bash,
and the slowdown is the same, so I assume it's related to the console
emulation and not to the fact that COMMAND.COM is a 16-bit DOS application.
Switching to full-screen doesn't change anything, either.
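
A test of the kind described might look roughly like the following.  This is
an assumed stand-in, not the actual program: the workload (a midpoint-rule
integral of 1/x) and the names integrate_reciprocal and timed_run are invented
for illustration.  The only point is the shape: compute silently, then print
the result and the elapsed time once at the end.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>

// Hypothetical workload: midpoint-rule integral of 1/x over [a, b], n slices.
double integrate_reciprocal(double a, double b, long n)
{
    assert(n > 0 && a > 0.0 && b > a);
    const double h = (b - a) / n;
    double sum = 0.0;
    for (long i = 0; i < n; ++i)
        sum += 1.0 / (a + (i + 0.5) * h);   // sample at the slice midpoint
    return sum * h;
}

// Runs the workload with no output, then prints result and elapsed time once.
// What std::clock() measures (CPU or wall-clock time) depends on the runtime.
double timed_run(long n)
{
    const std::clock_t start = std::clock();
    const double result = integrate_reciprocal(1.0, 2.0, n);
    const double seconds = double(std::clock() - start) / CLOCKS_PER_SEC;
    std::printf("Result is %g\nElapsed time: %g s\n", result, seconds);
    return result;
}
```

Calling timed_run() from main() with a large slice count and running the same
EXE from each shell gives two elapsed times to compare.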

Has anyone else noticed this problem?  Does it also happen on the WinNT family
(Win2000/XP)?

Regards,
  Laurentiu
May 26 2002
parent reply "Walter" <walter digitalmars.com> writes:
It could be a stack alignment issue. Check that the stack is aligned to 16
bytes on both. -Walter
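
One way to run this check is to take the address of a local variable in each
function and look at its low bits.  The sketch below is only an illustration:
the helper names misalignment and is_aligned are made up here and are not part
of DMC or of any program in this thread, and it assumes a compiler that
provides <cstdint>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of bytes by which `p` lies past the previous `boundary`-byte
// boundary; 0 means aligned.  `boundary` must be a power of two.
inline std::size_t misalignment(const void* p, std::size_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return static_cast<std::size_t>(
        reinterpret_cast<std::uintptr_t>(p) & (boundary - 1));
}

// True when `p` sits exactly on a `boundary`-byte boundary.
inline bool is_aligned(const void* p, std::size_t boundary)
{
    return misalignment(p, boundary) == 0;
}
```

Calling misalignment(&local, 16) on a local variable in main() and again
inside the function under test, in both builds, gives numbers that can be
compared directly.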
May 26 2002
parent reply "Laurentiu Pancescu" <user domain.invalid> writes:
Thanks, Walter!  The difference seems unrelated to stack alignment, but I
noticed that the DMC stack alignment problem isn't completely solved (DMC
tries to keep 8-byte alignment, AFAIK).  Please try the enclosed demo:

"sc -o -6 -ff int1 main"  (or int2, or int3, it doesn't matter - all of them
are affected)

Here's the output of int1.exe:

Stack of main() 0x64fde8
Stack of integrate: 0x64fda4 --> misalignment generated by main()
Result is 6.93147e+06
Elapsed time: 1.719 s

If I compile for X32 (sc -o -6 -ff -mx int1 main x32.lib), I get:

Stack of main() 0xfff98fdc  --> note the misalignment; maybe we should ask Doug for para alignment?
Stack of integrate: 0xfff98f90 --> the next misalignment makes things right  :)
Result is 6.93147e+006
Elapsed time: 1.32 s
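
The four addresses reported above can be checked by masking off their low
bits.  This is just a worked example of that arithmetic; offset_past is a
hypothetical helper, and the constants are copied from the two program
outputs.

```cpp
#include <cassert>
#include <cstdint>

// Offset of `address` past the previous `boundary`-byte boundary.
// `boundary` must be a power of two; 0 means aligned.
inline std::uint32_t offset_past(std::uint32_t address, std::uint32_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return address & (boundary - 1);
}

// Win32 build: main() is on an 8-byte boundary, integrate is 4 bytes off one.
const std::uint32_t win32_main      = 0x0064fde8u;  // offset 0 mod 8, 8 mod 16
const std::uint32_t win32_integrate = 0x0064fda4u;  // offset 4 mod 8, 4 mod 16

// X32 build: main() is 4 bytes off, integrate lands on a 16-byte boundary.
const std::uint32_t x32_main        = 0xfff98fdcu;  // offset 4 mod 8, 12 mod 16
const std::uint32_t x32_integrate   = 0xfff98f90u;  // offset 0 mod 8, 0 mod 16
```

So in the Win32 build the frame that does the floating-point work is off an
8-byte boundary, while in the X32 build it happens to end up aligned.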

It seems optimizing main() does something bad - everything is fine if you
compile main.cpp with no optimizations.

Best regards,
  Laurentiu

[Attachment: laur_align.zip (uuencoded archive omitted)]
May 26 2002
parent reply "Walter" <walter digitalmars.com> writes:
Ok, I'll check it out. -Walter
May 26 2002
next sibling parent "Laurentiu Pancescu" <lpancescu fastmail.fm> writes:
Hello Walter!

I forgot to mention my DMC version: 8.28, which I downloaded yesterday (CD
update).  It seems stack alignment works fine if I put everything in a
single CPP file.

In this example, the performance improvement from proper alignment isn't
that big, as you can see from the X32 program output; probably because
everything fits in the microprocessor's cache, and the misaligned memory
accesses are not performed inside the calculation loop.  Just a guess... :)

Regards,
  Laurentiu
May 27 2002
prev sibling parent reply "Laurentiu Pancescu" <user nowhere.near> writes:
"Walter" <walter digitalmars.com> wrote in message
news:acsikl$27pe$1 digitaldaemon.com...
 Ok, I'll check it out. -Walter

Gee... I hope you didn't get angry with me about this!  At least, does it
also happen on your machine when compiling my test programs?  I've noticed
that the stack is correctly aligned in many more programs (Win32 only; X32 is
something different), so I'm not sure this wasn't just a special case.

Laurentiu
Jun 03 2002
parent reply "Walter" <walter digitalmars.com> writes:
"Laurentiu Pancescu" <user nowhere.near> wrote in message
news:adf86b$p60$1 digitaldaemon.com...
 Gee... I hope you didn't get angry with me about this!  At least, does it
 also happen on your machine when compiling my test programs?  I've noticed
 that the stack is correctly aligned in many more programs (Win32 only; X32 is
 something different), so I'm not sure this wasn't just a special case.

I'm annoyed with myself, not you, for there being a bug in the alignment
process.  I'm just glad you took the time to point it out and prepare a test
case for me.
Jun 03 2002
parent reply "Laurentiu Pancescu" <user domain.invalid> writes:
You got me scared for a moment! <g>.  I didn't create those small demos for
testing alignment; I had just read a nice article at www.oonumerics.org about
different strategies for achieving performance in C++ similar to that of
FORTRAN.  So I wrote those 3 small programs and compared the results (also
between different compilers).

I hope you'll find that bug...

Oh, related to X32: do you have any docs on what's required of an X32
drop-in replacement?  It seems the extender is no longer maintained, and
maybe we could come up with an open-source, state-of-the-art one.

Regards,
  Laurentiu
Jun 03 2002
parent "Walter" <walter digitalmars.com> writes:
"Laurentiu Pancescu" <user domain.invalid> wrote in message
news:adge5s$226v$2 digitaldaemon.com...
 Oh, related to X32: do you have some docs related to what's required from
 an X32 drop-in replacement?  It seems the extender is no longer maintained,
 and maybe we could come up with an open-source, state-of-the-art one.

There aren't any docs other than the library source code.
Jun 03 2002