digitalmars.D - writef too slow?

Helmut Leitner (37/37) Oct 15 2004 I took a timeout 0.78-0.98 and when I came back I had

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= (14/20) Oct 15 2004 Shouldn't that be "string" ? "String" sounds like a class.

Helmut Leitner (12/22) Oct 15 2004 Depends on whether standards for this develop in the community.

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= (3/8) Oct 15 2004 My own humble suggestion was string for char[] and ustring for wchar[]

Ben Hinkle (6/19) Oct 15 2004 A small correction: the doFormatCallback is actually in std.stream and n...

Walter (5/8) Oct 15 2004 It isn't the same, the one in std.stdio has an opimization for ASCII in ...

Derek (9/56) Oct 15 2004 I must be a *lot* more tolerant of /slow/ performance. On my machine, my
Sean Kelly (21/29) Oct 15 2004 writef is meant to be the D replacement for printf. It has support for
Ben Hinkle (27/27) Oct 15 2004 I pointed VTune at
Ben Hinkle (64/64) Oct 15 2004 I've attached a modified version of std.stdio that builds up a buffer be...

Ben Hinkle (6/18) Oct 15 2004 Looking at that code again my second modification in the non-ASCII cased...

Walter (9/26) Oct 15 2004 Replace

Ben Hinkle (8/36) Oct 16 2004 lock

Walter (7/17) Oct 16 2004 lock

Helmut Leitner <helmut.leitner wikiservice.at> writes:

I took a timeout 0.78-0.98 and when I came back I had 
missed whatever writef discussions there have been.

When I then, about a month ago, published the first dmake
version it contained lots of simple printf calls.
Contributors were quick to implement lots of useful
extensions and all of them also switched printf to writef
naturally. I was a bit ashamed for having missed that.

But now, as I'm preparing the next dmake version I did
a routine check for speed and found writef so slow in
comparision that I can hardly switch without hesitating:


  import std.stdio;
  import venus.benchmark;

  alias char[] String;

  int main(String [] args)
  {
    String s="World";
    double pi=3.14;

    DelegateBenchPrint(delegate void () { writef("writef - Hello, %s.
PI=%.2f!\n",s,pi); } ,"writef:
");
    DelegateBenchPrint(delegate void () { printf("printf - Hello, %.*s.
PI=%.2f!\n",s,pi); }
,"printf: ");
    return 0;
  }

gives 
  - about 3 usec for the printf version
  - about 12-16 usec for the writef version
when redirected to a file (900 MHz Athlon/Windows)

This factor 4-5 lets me hesitate to use writef, because 
it might actually make tools like dmake look bad.

What is the "official" situation with printf / writef?

Have you noticed a similar performance gap?

Is there a chance for a better implementation or is this
the OO price we just have to pay?

Are you willing to pay the prize?

-- 
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com

Oct 15 2004

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:

Helmut Leitner wrote:

 But now, as I'm preparing the next dmake version I did
 a routine check for speed and found writef so slow in
 comparision that I can hardly switch without hesitating:

[...]
  alias char[] String;

Shouldn't that be "string" ? "String" sounds like a class.

 This factor 4-5 lets me hesitate to use writef, because 
 it might actually make tools like dmake look bad.

I believe that the conversion has something to do with it:

   private void doFormatCallback(dchar c)
   {
     char[4] buf;
     char[] b;
     b = std.utf.toUTF8(buf, c);
     writeString(b);
   }

This is called for each (UTF-32) character to be output...


So it sounds more like an Unicode issue than an OOP one ?

--anders

Oct 15 2004

Helmut Leitner <leitner hls.via.at> writes:

Anders F Bj�rklund wrote:
 
 Helmut Leitner wrote:
 
 But now, as I'm preparing the next dmake version I did
 a routine check for speed and found writef so slow in
 comparision that I can hardly switch without hesitating:

 [...]
  alias char[] String;

 
 Shouldn't that be "string" ? "String" sounds like a class.

Depends on whether standards for this develop in the community. 
One could interpret lowercase as:
  - builtin
  - atom / scalar
  - not an object
  .....

As far as I see, there will never be a String class in D, but
there will also never be an official statement about that...

-- 
Helmut Leitner    leitner hls.via.at
Graz, Austria   www.hls-software.com

Oct 15 2004

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:

Helmut Leitner wrote:

 alias char[] String;

Shouldn't that be "string" ? "String" sounds like a class.

 
 Depends on whether standards for this develop in the community. 

My own humble suggestion was string for char[] and ustring for wchar[]

--anders

Oct 15 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

 This factor 4-5 lets me hesitate to use writef, because
 it might actually make tools like dmake look bad.

 I believe that the conversion has something to do with it:

    private void doFormatCallback(dchar c)
    {
      char[4] buf;
      char[] b;
      b = std.utf.toUTF8(buf, c);
      writeString(b);
    }

 This is called for each (UTF-32) character to be output...


 So it sounds more like an Unicode issue than an OOP one ?

 --anders

A small correction: the doFormatCallback is actually in std.stream and not
std.stdio but the one in std.stream is essentially the same as the one in
std.stdio so it probably doesn't matter.

I would try running a profiler like VTune to see where the time is being
spent. Either that or throw in some performance monitoring code yourself.

-Ben

Oct 15 2004

"Walter" <newshound digitalmars.com> writes:

"Ben Hinkle" <bhinkle mathworks.com> wrote in message
news:ckokq8$1uv3$1 digitaldaemon.com...
 A small correction: the doFormatCallback is actually in std.stream and not
 std.stdio but the one in std.stream is essentially the same as the one in
 std.stdio so it probably doesn't matter.

It isn't the same, the one in std.stdio has an opimization for ASCII in it.
One probable culprit, however, is the fact that writef() does a lock for
each character output, whereas printf() does one lock for the whole write.

Oct 15 2004

Derek <derek psyc.ward> writes:

On Fri, 15 Oct 2004 13:05:57 +0200, Helmut Leitner wrote:

 I took a timeout 0.78-0.98 and when I came back I had 
 missed whatever writef discussions there have been.
 
 When I then, about a month ago, published the first dmake
 version it contained lots of simple printf calls.
 Contributors were quick to implement lots of useful
 extensions and all of them also switched printf to writef
 naturally. I was a bit ashamed for having missed that.
 
 But now, as I'm preparing the next dmake version I did
 a routine check for speed and found writef so slow in
 comparision that I can hardly switch without hesitating:
 
 
   import std.stdio;
   import venus.benchmark;
 
   alias char[] String;
 
   int main(String [] args)
   {
     String s="World";
     double pi=3.14;
 
     DelegateBenchPrint(delegate void () { writef("writef - Hello, %s.
PI=%.2f!\n",s,pi); } ,"writef:
 ");
     DelegateBenchPrint(delegate void () { printf("printf - Hello, %.*s.
PI=%.2f!\n",s,pi); }
 ,"printf: ");
     return 0;
   }
 
 gives 
   - about 3 usec for the printf version
   - about 12-16 usec for the writef version
 when redirected to a file (900 MHz Athlon/Windows)
 
 This factor 4-5 lets me hesitate to use writef, because 
 it might actually make tools like dmake look bad.
 
 What is the "official" situation with printf / writef?
 
 Have you noticed a similar performance gap?
 
 Is there a chance for a better implementation or is this
 the OO price we just have to pay?
 
 Are you willing to pay the prize?

I must be a *lot* more tolerant of /slow/ performance. On my machine, my
extended dmake, with full verbose mode, compiles all of Phobos and creates
a phobos library file in less than 2 seconds. Barely long enough for me to
take a sip of coffee. So I guess the /slow/ performance of writef is not
going to upset me too much.

-- 
Derek
Melbourne, Australia

Oct 15 2004

Sean Kelly <sean f4.ca> writes:

Helmut Leitner wrote:
 This factor 4-5 lets me hesitate to use writef, because 
 it might actually make tools like dmake look bad.
 
 What is the "official" situation with printf / writef?

writef is meant to be the D replacement for printf.  It has support for 
char, wchar, and dchar, as well as (I think) imaginary numbers.

 Have you noticed a similar performance gap?

Haven't tested it.

 Is there a chance for a better implementation or is this
 the OO price we just have to pay?

Looking at the code, I'm not sure it can get much more efficient. 
writef does a lot more than printf does, mostly to do with detecting the 
types of arguments.  It then passes the result off to C functions for 
further processing, so perhaps if the entire implementation were done in 
D code performance may improve a bit, but I'm skeptical.

 Are you willing to pay the prize?

C++ IOStreams have similar performance issues, but the price tends to be 
worth it for the type safety and such that they provide.  I feel the 
same way about writef, though obviously I'd prefer it be faster if 
possible.  FWIW, I've implemented the input side (readf) that could bear 
some testing.  The link is here:

http://home.f4.ca/sean/d/stdio.zip

I expect it may be even more than 4 times slower than scanf, as I 
convert everything to UTF-32 for processing.  Sadly, I don't see any 
easy way around this particular issue, as it seems the only completely 
safe way to do pattern matching.  If you've any suggestions, please let 
me know.


Sean

Oct 15 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

I pointed VTune at

import std.stdio;
alias char[] String;
void writeftest() {
    String s="World";
    double pi=3.14;
    writef("writef - Hello, %s. PI = %.2f.\n",s,pi);
}
void printftest() {
    String s="World";
    double pi=3.14;
    printf("printf - Hello, %.*s. PI = %.2f.\n",s,pi);
}
int main(String [] args)
  {
    String s="World";
    double pi=3.14;
    for (int k=0;k<10000;k++) writeftest();
    for (int k=0;k<10000;k++) printftest();
    return 0;
  }

and noticed that most of the time is being spent in __fp_lock and
__fp_unlock as writef calls fputc for every character and so it looks like
it has to lock and unlock the file reference every time. Probably some
buffering mechanism (using std.stream.BufferedFile or something) would get
the speed back to printf.

-Ben

Oct 15 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

I've attached a modified version of std.stdio that builds up a buffer before
calling file I/O. I'd be curious if it speeds up your code. Replace
   import std.stdio;
replaced with
   import stdio2;
and compile in stdio2.d with your application.

Note the attached file doesn't handle the wide case. I only did the fputc
case. Also it uses a buffer size of 32 as a guess - bigger might waste time
initializing array elements that aren't used.

-Ben


begin 666 stdio2.d

M;6%R<RYC;VT-"B J(%!L86-E9"!I;B!T:&4 4'5B;&EC($1O;6%I;BX-"B J

M<FEV871E(&EM<&]R="!S=&0N9F]R;6%T.PT*<')I=F%T92!I;7!O<G0 <W1D
M+G5T9CL-"F-O;G-T(&EN="!"=69F97)3:7IE(#T
M=F]I9"!W<FET97 H1DE,12H 9G L(%1Y<&5);F9O6UT 87)G=6UE;G1S+"!V
M;VED*B!A<F=P='(L(&EN="!N97=L:6YE*0T*>R  (&EN="!O<FEE;G1A=&EO
M;CL-"B  ("!C:&%R6T)U9F9E<E-I>F5=(&)U9F9E<CL-"B  ("!I;G0 ;B ]


M('9O:60 8VAE8VM?8G5F9F5R*"D >PT*("  ("  ("!I9B H;B ]/2!"=69F
M97)3:7IE*2 -"B  ("  ("  >PT*("  ("  ("  ("  9FQU<VA?8G5F9F5R



M<'5T8RAD8VAA<B!C*0T*"7L-" D ("  :68 *&, /#T ,' W1BD-" D ("  
M>PT*("  ("  ("  ("  ("  (&)U9F9E<EMN*RM=(#T 8V%S="AC:&%R*6,[





M("  ("  ("!B=69F97);;BLK72 ](&-A<W0H8VAA<BEC.PT*("  ("  ("  
M("  ("  ("  ("!C:&5C:U]B=69F97(H*3L-"B\O"0D ("  <W1D+F,N<W1D
M:6\N9G!U=&,H8EMI72P 9G I.PT*("  ("  ("  ("  ("  ('T-" D ("  
M?0T*"7T-" T*"7-T9"YF;W)M870N9&]&;W)M870H)G!U=&,L(&%R9W5M96YT



M<FEE;G1A=&EO; T*("  ('L-" EV97)S:6]N("A7:6YD;W=S*0T*"7L-" D 





M(#T


M;'-E('9E<G-I;VX *&QI;G5X*0T*"7L-" D ("  =F]I9"!P=71C=RAD8VAA



M+"!A<F=P='(I.PT*"6EF("AN97=L:6YE*0T*"2  ("!S=&0N8RYS=&1I;RYF




M(&9W<FET968H1DE,12H 9G L("XN+BD-"GL-"B  ("!W<FET97 H9G L(%]A

M24Q%*B!F<"P +BXN*0T*>PT*("  ('=R:71E>"AF<"P 7V%R9W5M96YT<RP 

`
end

Oct 15 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Ben Hinkle wrote:

 I've attached a modified version of std.stdio that builds up a buffer
 before calling file I/O. I'd be curious if it speeds up your code. Replace
    import std.stdio;
 replaced with
    import stdio2;
 and compile in stdio2.d with your application.
 
 Note the attached file doesn't handle the wide case. I only did the fputc
 case. Also it uses a buffer size of 32 as a guess - bigger might waste
 time initializing array elements that aren't used.
 
 -Ben

Looking at that code again my second modification in the non-ASCII cased is
hosed. I just copy-and-pasted and forgot to update the variable name. So
only call it with ASCII. I'll probably do some more experiments to see what
buffer size is best and send something to Walter.

-Ben

Oct 15 2004

"Walter" <newshound digitalmars.com> writes:

"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:ckplmr$2sr5$1 digitaldaemon.com...
 Ben Hinkle wrote:

 I've attached a modified version of std.stdio that builds up a buffer
 before calling file I/O. I'd be curious if it speeds up your code.


Replace
    import std.stdio;
 replaced with
    import stdio2;
 and compile in stdio2.d with your application.

 Note the attached file doesn't handle the wide case. I only did the


fputc
 case. Also it uses a buffer size of 32 as a guess - bigger might waste
 time initializing array elements that aren't used.

 -Ben

 Looking at that code again my second modification in the non-ASCII cased

is
 hosed. I just copy-and-pasted and forgot to update the variable name. So
 only call it with ASCII. I'll probably do some more experiments to see

what
 buffer size is best and send something to Walter.

I don't believe the buffer size is the problem. The problem is that the lock
should be surround all the calls to fputc, not embedded within fputc. Look
at the implementation of printf in \dm\src\core\printf.c.

Oct 15 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

"Walter" <newshound digitalmars.com> wrote in message
news:ckq78u$6eo$1 digitaldaemon.com...
 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:ckplmr$2sr5$1 digitaldaemon.com...
 Ben Hinkle wrote:

 I've attached a modified version of std.stdio that builds up a buffer
 before calling file I/O. I'd be curious if it speeds up your code.


 Replace
    import std.stdio;
 replaced with
    import stdio2;
 and compile in stdio2.d with your application.

 Note the attached file doesn't handle the wide case. I only did the


 fputc
 case. Also it uses a buffer size of 32 as a guess - bigger might waste
 time initializing array elements that aren't used.

 -Ben

 Looking at that code again my second modification in the non-ASCII cased

 is
 hosed. I just copy-and-pasted and forgot to update the variable name. So
 only call it with ASCII. I'll probably do some more experiments to see

 what
 buffer size is best and send something to Walter.

 I don't believe the buffer size is the problem. The problem is that the

lock
 should be surround all the calls to fputc, not embedded within fputc. Look
 at the implementation of printf in \dm\src\core\printf.c.

I just meant the buffer size shouldn't be so big that the average time spent
initialzing unused elements is greater than the average time spent flushing.
Off the top of my head that size is probably pretty big but who knows - it
is system dependent. I'll check out the printf code - where is it? I can't
find it in the dmd.zip download.

Oct 16 2004

"Walter" <newshound digitalmars.com> writes:

"Ben Hinkle" <bhinkle mathworks.com> wrote in message
news:cks3uo$1oao$1 digitaldaemon.com...
 "Walter" <newshound digitalmars.com> wrote in message
 news:ckq78u$6eo$1 digitaldaemon.com...
 I don't believe the buffer size is the problem. The problem is that the


lock
 should be surround all the calls to fputc, not embedded within fputc.


Look
 at the implementation of printf in \dm\src\core\printf.c.

 I just meant the buffer size shouldn't be so big that the average time

spent
 initialzing unused elements is greater than the average time spent

flushing.
 Off the top of my head that size is probably pretty big but who knows - it
 is system dependent. I'll check out the printf code - where is it? I can't
 find it in the dmd.zip download.

It's on the DMC++ CD in \dm\src\core\printf.c

Oct 16 2004

D Programming

C/C++ Programming

Other

digitalmars.D - writef too slow?