digitalmars.D - writef too slow?
- Helmut Leitner <helmut.leitner wikiservice.at> Oct 15 2004
- Anders F Björklund <afb algonet.se> Oct 15 2004
- Helmut Leitner <leitner hls.via.at> Oct 15 2004
- "Ben Hinkle" <bhinkle mathworks.com> Oct 15 2004
- "Walter" <newshound digitalmars.com> Oct 15 2004
- Derek <derek psyc.ward> Oct 15 2004
- Sean Kelly <sean f4.ca> Oct 15 2004
- "Ben Hinkle" <bhinkle mathworks.com> Oct 15 2004
- "Ben Hinkle" <bhinkle mathworks.com> Oct 15 2004
- Ben Hinkle <bhinkle4 juno.com> Oct 15 2004
- "Walter" <newshound digitalmars.com> Oct 15 2004
- "Ben Hinkle" <bhinkle mathworks.com> Oct 16 2004
- "Walter" <newshound digitalmars.com> Oct 16 2004
I took a timeout 0.78-0.98 and when I came back I had
missed whatever writef discussions there have been.
When I then, about a month ago, published the first dmake
version it contained lots of simple printf calls.
Contributors were quick to implement lots of useful
extensions and all of them also switched printf to writef
naturally. I was a bit ashamed for having missed that.
But now, as I'm preparing the next dmake version I did
a routine check for speed and found writef so slow in
comparison that I can hardly switch without hesitating:
import std.stdio;
import venus.benchmark;
alias char[] String;
int main(String[] args)
{
    String s = "World";
    double pi = 3.14;
    DelegateBenchPrint(delegate void () { writef("writef - Hello, %s. PI=%.2f!\n", s, pi); }, "writef: ");
    DelegateBenchPrint(delegate void () { printf("printf - Hello, %.*s. PI=%.2f!\n", s, pi); }, "printf: ");
    return 0;
}
gives
- about 3 usec for the printf version
- about 12-16 usec for the writef version
when redirected to a file (900 MHz Athlon/Windows)
This factor 4-5 lets me hesitate to use writef, because
it might actually make tools like dmake look bad.
What is the "official" situation with printf / writef?
Have you noticed a similar performance gap?
Is there a chance for a better implementation or is this
the OO price we just have to pay?
Are you willing to pay the price?
--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
Oct 15 2004
Helmut Leitner wrote:
But now, as I'm preparing the next dmake version I did a routine check for speed and found writef so slow in comparison that I can hardly switch without hesitating:
alias char[] String;
Shouldn't that be "string"? "String" sounds like a class.
This factor 4-5 lets me hesitate to use writef, because it might actually make tools like dmake look bad.
I believe that the conversion has something to do with it:
private void doFormatCallback(dchar c)
{
    char[4] buf;
    char[] b;
    b = std.utf.toUTF8(buf, c);
    writeString(b);
}
This is called for each (UTF-32) character to be output... So it sounds more like a Unicode issue than an OOP one?
--anders
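For readers who want to see what a per-character UTF-32 to UTF-8 conversion involves, here is a minimal sketch in C. It is not the Phobos code; the function name encode_utf8 is made up for illustration. It assumes the caller hands in one code point at a time, as doFormatCallback does, and it takes an early exit for ASCII so that plain text costs a single comparison and a store.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   buf must have room for 4 bytes. Returns the number of bytes
   written, or 0 if c is not a valid code point. */
size_t encode_utf8(uint32_t c, unsigned char buf[4])
{
    if (c < 0x80) {                      /* ASCII fast path: one byte */
        buf[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {                     /* two-byte sequence */
        buf[0] = (unsigned char)(0xC0 | (c >> 6));
        buf[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c >= 0xD800 && c <= 0xDFFF)      /* surrogates are not encodable */
        return 0;
    if (c < 0x10000) {                   /* three-byte sequence */
        buf[0] = (unsigned char)(0xE0 | (c >> 12));
        buf[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c <= 0x10FFFF) {                 /* four-byte sequence */
        buf[0] = (unsigned char)(0xF0 | (c >> 18));
        buf[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;                            /* beyond the Unicode range */
}
```

The conversion itself is a handful of shifts and masks, so on its own it is unlikely to explain a factor of 4-5; what happens to each encoded byte afterwards matters more.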
Oct 15 2004
Anders F Björklund wrote:
Helmut Leitner wrote:
But now, as I'm preparing the next dmake version I did a routine check for speed and found writef so slow in comparison that I can hardly switch without hesitating:
alias char[] String;
Shouldn't that be "string" ? "String" sounds like a class.
Depends on whether standards for this develop in the community. One could interpret lowercase as:
- builtin
- atom / scalar
- not an object
.....
As far as I see, there will never be a String class in D, but there will also never be an official statement about that...
--
Helmut Leitner leitner hls.via.at
Graz, Austria www.hls-software.com
Oct 15 2004
Helmut Leitner wrote:
alias char[] String;
Shouldn't that be "string" ? "String" sounds like a class.
Depends on whether standards for this develop in the community.
My own humble suggestion was string for char[] and ustring for wchar[] --anders
Oct 15 2004
This factor 4-5 lets me hesitate to use writef, because it might actually make tools like dmake look bad.
I believe that the conversion has something to do with it:
private void doFormatCallback(dchar c)
{
    char[4] buf;
    char[] b;
    b = std.utf.toUTF8(buf, c);
    writeString(b);
}
This is called for each (UTF-32) character to be output... So it sounds more like a Unicode issue than an OOP one?
--anders
A small correction: the doFormatCallback is actually in std.stream and not std.stdio but the one in std.stream is essentially the same as the one in std.stdio so it probably doesn't matter. I would try running a profiler like VTune to see where the time is being spent. Either that or throw in some performance monitoring code yourself. -Ben
Oct 15 2004
"Ben Hinkle" <bhinkle mathworks.com> wrote in message news:ckokq8$1uv3$1 digitaldaemon.com...A small correction: the doFormatCallback is actually in std.stream and not std.stdio but the one in std.stream is essentially the same as the one in std.stdio so it probably doesn't matter.
It isn't the same, the one in std.stdio has an optimization for ASCII in it. One probable culprit, however, is the fact that writef() does a lock for each character output, whereas printf() does one lock for the whole write.
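The difference between locking per character and locking once per write can be sketched in C. This is not the Digital Mars runtime code: it assumes a POSIX C library, where flockfile, funlockfile and putc_unlocked are available, and the two function names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes flockfile/putc_unlocked */
#include <stdio.h>

/* One lock per character: each fputc call acquires and releases
   the stream lock internally. Returns characters written, or -1. */
int write_lock_per_char(FILE *fp, const char *s)
{
    int n = 0;
    for (; *s; ++s, ++n) {
        if (fputc((unsigned char)*s, fp) == EOF)
            return -1;
    }
    return n;
}

/* One lock for the whole write: take the stream lock once, then
   use the unlocked variant inside the loop. Returns characters
   written, or -1 on error. */
int write_lock_once(FILE *fp, const char *s)
{
    int n = 0;
    flockfile(fp);
    for (; *s; ++s, ++n) {
        if (putc_unlocked((unsigned char)*s, fp) == EOF) {
            n = -1;
            break;
        }
    }
    funlockfile(fp);
    return n;
}
```

Both functions produce the same bytes; only the number of lock and unlock operations differs, one pair per character in the first and one pair per call in the second.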
Oct 15 2004
On Fri, 15 Oct 2004 13:05:57 +0200, Helmut Leitner wrote:
I took a timeout 0.78-0.98 and when I came back I had missed whatever writef discussions there have been. When I then, about a month ago, published the first dmake version it contained lots of simple printf calls. Contributors were quick to implement lots of useful extensions and all of them also switched printf to writef naturally. I was a bit ashamed for having missed that. But now, as I'm preparing the next dmake version I did a routine check for speed and found writef so slow in comparison that I can hardly switch without hesitating:
import std.stdio;
import venus.benchmark;
alias char[] String;
int main(String[] args)
{
    String s = "World";
    double pi = 3.14;
    DelegateBenchPrint(delegate void () { writef("writef - Hello, %s. PI=%.2f!\n", s, pi); }, "writef: ");
    DelegateBenchPrint(delegate void () { printf("printf - Hello, %.*s. PI=%.2f!\n", s, pi); }, "printf: ");
    return 0;
}
gives
- about 3 usec for the printf version
- about 12-16 usec for the writef version
when redirected to a file (900 MHz Athlon/Windows)
This factor 4-5 lets me hesitate to use writef, because it might actually make tools like dmake look bad. What is the "official" situation with printf / writef? Have you noticed a similar performance gap? Is there a chance for a better implementation or is this the OO price we just have to pay? Are you willing to pay the price?
I must be a *lot* more tolerant of /slow/ performance. On my machine, my extended dmake, with full verbose mode, compiles all of Phobos and creates a phobos library file in less than 2 seconds. Barely long enough for me to take a sip of coffee. So I guess the /slow/ performance of writef is not going to upset me too much. -- Derek Melbourne, Australia
Oct 15 2004
Helmut Leitner wrote:
This factor 4-5 lets me hesitate to use writef, because it might actually make tools like dmake look bad. What is the "official" situation with printf / writef?
writef is meant to be the D replacement for printf. It has support for char, wchar, and dchar, as well as (I think) imaginary numbers.
Have you noticed a similar performance gap?
Haven't tested it.
Is there a chance for a better implementation or is this the OO price we just have to pay?
Looking at the code, I'm not sure it can get much more efficient. writef does a lot more than printf does, mostly to do with detecting the types of arguments. It then passes the result off to C functions for further processing, so perhaps if the entire implementation were done in D code performance might improve a bit, but I'm skeptical.
Are you willing to pay the price?
C++ IOStreams have similar performance issues, but the price tends to be worth it for the type safety and such that they provide. I feel the same way about writef, though obviously I'd prefer it be faster if possible.
FWIW, I've implemented the input side (readf) that could bear some testing. The link is here: http://home.f4.ca/sean/d/stdio.zip I expect it may be even more than 4 times slower than scanf, as I convert everything to UTF-32 for processing. Sadly, I don't see any easy way around this particular issue, as it seems the only completely safe way to do pattern matching. If you've any suggestions, please let me know.
Sean
Oct 15 2004
I pointed VTune at
import std.stdio;
alias char[] String;
void writeftest() {
String s="World";
double pi=3.14;
writef("writef - Hello, %s. PI = %.2f.\n",s,pi);
}
void printftest() {
String s="World";
double pi=3.14;
printf("printf - Hello, %.*s. PI = %.2f.\n",s,pi);
}
int main(String [] args)
{
String s="World";
double pi=3.14;
for (int k=0;k<10000;k++) writeftest();
for (int k=0;k<10000;k++) printftest();
return 0;
}
and noticed that most of the time is being spent in __fp_lock and
__fp_unlock as writef calls fputc for every character and so it looks like
it has to lock and unlock the file reference every time. Probably some
buffering mechanism (using std.stream.BufferedFile or something) would get
the speed back to printf.
-Ben
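The buffering idea can be sketched in C as follows. This is not the attached stdio2.d; it is a guess at the general shape, with made-up names (out_buf, out_init, out_putc, out_flush). Characters collect in a small fixed array and reach the stream through one fwrite per full buffer, so the stream is locked once per flush instead of once per character.

```c
#include <stddef.h>
#include <stdio.h>

enum { OUT_BUF_SIZE = 32 };  /* small on purpose; the right size is an open question */

/* Hypothetical accumulator in front of a FILE stream. */
struct out_buf {
    FILE  *fp;
    size_t len;                 /* bytes currently held in data */
    char   data[OUT_BUF_SIZE];  /* deliberately left uninitialized */
};

/* Attach the buffer to a stream without touching data[]. */
void out_init(struct out_buf *b, FILE *fp)
{
    b->fp = fp;
    b->len = 0;
}

/* Write out whatever is buffered. Returns 0 on success, -1 on a short write. */
int out_flush(struct out_buf *b)
{
    size_t n = b->len;
    b->len = 0;
    if (n != 0 && fwrite(b->data, 1, n, b->fp) != n)
        return -1;
    return 0;
}

/* Append one character, flushing first if the buffer is full.
   Returns 0 on success, -1 if the flush failed. */
int out_putc(struct out_buf *b, char c)
{
    if (b->len == OUT_BUF_SIZE && out_flush(b) != 0)
        return -1;
    b->data[b->len++] = c;
    return 0;
}
```

A caller would run out_init once, feed every formatted character through out_putc, and call out_flush at the end of each writef-style call so output is not left sitting in the buffer.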
Oct 15 2004
I've attached a modified version of std.stdio that builds up a buffer before calling file I/O. I'd be curious if it speeds up your code. Replace import std.stdio; with import stdio2; and compile in stdio2.d with your application. Note the attached file doesn't handle the wide case. I only did the fputc case. Also it uses a buffer size of 32 as a guess - bigger might waste time initializing array elements that aren't used. -Ben
Oct 15 2004
Ben Hinkle wrote:
I've attached a modified version of std.stdio that builds up a buffer before calling file I/O. I'd be curious if it speeds up your code. Replace import std.stdio; with import stdio2; and compile in stdio2.d with your application. Note the attached file doesn't handle the wide case. I only did the fputc case. Also it uses a buffer size of 32 as a guess - bigger might waste time initializing array elements that aren't used. -Ben
Looking at that code again my second modification in the non-ASCII case is hosed. I just copy-and-pasted and forgot to update the variable name. So only call it with ASCII. I'll probably do some more experiments to see what buffer size is best and send something to Walter. -Ben
Oct 15 2004
"Ben Hinkle" <bhinkle4 juno.com> wrote in message news:ckplmr$2sr5$1 digitaldaemon.com...Ben Hinkle wrote:I've attached a modified version of std.stdio that builds up a buffer before calling file I/O. I'd be curious if it speeds up your code.
Replace import std.stdio; with import stdio2; and compile in stdio2.d with your application. Note the attached file doesn't handle the wide case. I only did the fputc case. Also it uses a buffer size of 32 as a guess - bigger might waste time initializing array elements that aren't used. -Ben
Looking at that code again my second modification in the non-ASCII case is hosed. I just copy-and-pasted and forgot to update the variable name. So only call it with ASCII. I'll probably do some more experiments to see what buffer size is best and send something to Walter.
I don't believe the buffer size is the problem. The problem is that the lock should surround all the calls to fputc, not be embedded within fputc. Look at the implementation of printf in \dm\src\core\printf.c.
Oct 15 2004
"Walter" <newshound digitalmars.com> wrote in message news:ckq78u$6eo$1 digitaldaemon.com..."Ben Hinkle" <bhinkle4 juno.com> wrote in message news:ckplmr$2sr5$1 digitaldaemon.com...Ben Hinkle wrote:I've attached a modified version of std.stdio that builds up a buffer before calling file I/O. I'd be curious if it speeds up your code.
Replace import std.stdio; with import stdio2; and compile in stdio2.d with your application. Note the attached file doesn't handle the wide case. I only did the fputc case. Also it uses a buffer size of 32 as a guess - bigger might waste time initializing array elements that aren't used. -Ben
Looking at that code again my second modification in the non-ASCII case is hosed. I just copy-and-pasted and forgot to update the variable name. So only call it with ASCII. I'll probably do some more experiments to see what buffer size is best and send something to Walter.
I don't believe the buffer size is the problem. The problem is that the lock should surround all the calls to fputc, not be embedded within fputc. Look at the implementation of printf in \dm\src\core\printf.c.
I just meant the buffer size shouldn't be so big that the average time spent initializing unused elements is greater than the average time spent flushing. Off the top of my head that size is probably pretty big but who knows - it is system dependent. I'll check out the printf code - where is it? I can't find it in the dmd.zip download.
Oct 16 2004
"Ben Hinkle" <bhinkle mathworks.com> wrote in message news:cks3uo$1oao$1 digitaldaemon.com..."Walter" <newshound digitalmars.com> wrote in message news:ckq78u$6eo$1 digitaldaemon.com...I don't believe the buffer size is the problem. The problem is that the
lock should surround all the calls to fputc, not be embedded within fputc. Look at the implementation of printf in \dm\src\core\printf.c.
I just meant the buffer size shouldn't be so big that the average time spent initializing unused elements is greater than the average time spent flushing. Off the top of my head that size is probably pretty big but who knows - it is system dependent. I'll check out the printf code - where is it? I can't find it in the dmd.zip download.
It's on the DMC++ CD in \dm\src\core\printf.c
Oct 16 2004








