digitalmars.D.bugs - [Issue 14368] New: stdio.rawRead underperforms stdio

https://issues.dlang.org/show_bug.cgi?id=14368

          Issue ID: 14368
           Summary: stdio.rawRead underperforms stdio
           Product: D
           Version: D2
          Hardware: x86_64
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P1
         Component: Phobos
          Assignee: nobody puremagic.com
          Reporter: cooper.charles.m gmail.com

std.stdio.rawRead is 50-75% slower than core.stdc.stdio.fread in a tight loop.
A thin wrapper should match C stdio performance, or users will be unhappy.

// stdioperf.d
struct mystruct {
    long[4] data; // D-style array declaration; the C-style form is deprecated
}
void main() {
    enum bool CSTDIO = false;
    mystruct foo;
    static if (CSTDIO) {
        import core.stdc.stdio : stdin,fread;
        while (0 != fread(&foo, foo.sizeof, 1, stdin)) {}
    } else {
        static import std.stdio;
        while (0 != std.stdio.stdin.rawRead((&foo)[0..1]).length) {}
    }
}
//EOF

$ dmd --version
DMD64 D Compiler v2.067.0
Copyright (c) 1999-2014 by Digital Mars written by Walter Bright
$ dmd -O -inline -release -noboundscheck stdioperf.d 
$ time dd if=/dev/zero bs=1M count=8192 | ./stdioperf 
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 7.0038 s, 1.2 GB/s

real    0m7.005s
user    0m5.792s
sys    0m6.924s


$ gdc --version
gdc (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gdc -O3 -fno-bounds-check -fno-assert -fno-invariants -fno-in -fno-out stdioperf.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out 
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 6.07485 s, 1.4 GB/s

real    0m6.076s
user    0m4.908s
sys    0m6.684s

With CSTDIO = true (performance is the same regardless of compiler):
$ gdc -O3 stdioperf.d 
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out 
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.18047 s, 2.1 GB/s

real    0m4.182s
user    0m2.888s
sys    0m3.888s


Profiling suggests the overhead comes from the compiler failing to inline calls
to std.exception.enforce, calling errnoEnforce even when fread's return
indicates success, and from buffer slicing overhead.

The following patch to d/4.9/std/stdio.d (front end D 2.065) confirms this,
reducing the performance gap to ~2% (gdc -O2). It also gets rid of the
undocumented null return value:
609c609,611
<         enforce(buffer.length, "rawRead must take a non-empty buffer");
---
> 		if (!buffer.length) {
> 			enforce(buffer.length, "rawRead must take a non-empty buffer");
> 		}
625,626c627,631
<         errnoEnforce(!error);
<         return result ? buffer[0 .. result] : null;
---
> 		if (result < buffer.length) {
> 			errnoEnforce(!error);
> 			return buffer[0..result];
> 		}
> 		return buffer;

$ gdc -O3 stdioperf.d mystdio.d
$ time dd if=/dev/zero bs=1M count=8192 | ./a.out 
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 4.26723 s, 2.0 GB/s

real    0m4.269s
user    0m2.960s
sys    0m3.788s

The patch to dmd 2.067 phobos is similar except the line numbers are different:
715c715,717
<         enforce(buffer.length, "rawRead must take a non-empty buffer");
---
> 		if (!buffer.length) {
> 			enforce(false, "rawRead must take a non-empty buffer");
> 		}
733,734c735,739
<         errnoEnforce(!error);
<         return result ? buffer[0 .. result] : null;
---
> 		if (result < buffer.length) {
> 			errnoEnforce(!error);
> 			return buffer[0..result];
> 		}
> 		return buffer;
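
For readers who want to try the idea outside Phobos, the fast path that both patches introduce can be sketched as a free-standing function over a C FILE*. This is only an illustration under stated assumptions: rawReadFast and Record are hypothetical names, not part of Phobos, and the real File.rawRead does more than this sketch.

```d
import core.stdc.stdio : FILE, fread, ferror, stdin;
import std.exception : enforce, errnoEnforce;

// Hypothetical helper, not part of Phobos. It mirrors the patched logic:
// the enforce calls sit behind plain branches, so the common case
// (non-empty buffer, full read) pays for no error-handling calls.
T[] rawReadFast(T)(FILE* fp, T[] buffer)
{
    if (!buffer.length)
        enforce(false, "rawReadFast must take a non-empty buffer");

    immutable result = fread(buffer.ptr, T.sizeof, buffer.length, fp);
    if (result < buffer.length)
    {
        // Short read: either EOF or an I/O error, so check only here.
        errnoEnforce(!ferror(fp));
        return buffer[0 .. result];
    }
    return buffer; // full read: no re-slicing, no error check
}

// Same layout as mystruct in the test program above.
struct Record
{
    long[4] data;
}

void main()
{
    Record rec;
    // Read one record at a time until a read returns an empty slice.
    while (rawReadFast(stdin, (&rec)[0 .. 1]).length != 0) {}
}
```

Like the patches, this returns an empty slice rather than null at end of input, so callers only need to test .length.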

I also suggest updating the documentation of stdio.File.rawRead so that it
includes an example of idiomatic usage:

while (1)
    if (0 == rawRead(...).length) break;

Charles

--
Mar 28 2015