
digitalmars.D - Why can't a D framework perform the best?

zoujiaqing <zoujiaqing gmail.com> writes:
We have been continuously optimizing the performance of the hunt 
library's IO module, but it has not reached the level of ASP.NET 
Core, Rust, and Java.

https://www.techempower.com/benchmarks/#section=test&runid=9e7a6863-b92e-4079-a2a9-324426369751&hw=ph&test=plaintext

What good ideas do you have to offer?
May 08
Mathias LANG <geod24 gmail.com> writes:
On Friday, 8 May 2020 at 08:13:35 UTC, zoujiaqing wrote:
 We have been continuously optimizing the performance of the hunt 
 library's IO module, but it has not reached the level of ASP.NET 
 Core, Rust, and Java.

 https://www.techempower.com/benchmarks/#section=test&runid=9e7a6863-b92e-4079-a2a9-324426369751&hw=ph&test=plaintext

 What good ideas do you have to offer?
Looking at the data table, one thing stands out: performance for the "256" row (is it a size? Concurrent requests?) is lower by an order of magnitude. Assuming it's a size, that might just be a badly sized buffer, or a wrong "small data" optimization, but only profiling can tell.
May 08
lili <akozhao tencent.com> writes:
On Friday, 8 May 2020 at 08:13:35 UTC, zoujiaqing wrote:
 We have been continuously optimizing the performance of the hunt 
 library's IO module, but it has not reached the level of ASP.NET 
 Core, Rust, and Java.

 https://www.techempower.com/benchmarks/#section=test&runid=9e7a6863-b92e-4079-a2a9-324426369751&hw=ph&test=plaintext

 What good ideas do you have to offer?
It's about 10% behind the top three, which is a fairly big gap. In principle the IO shouldn't depend on D itself; could the threading model be the cause?
May 08
welkam <wwwelkam gmail.com> writes:
On Friday, 8 May 2020 at 08:13:35 UTC, zoujiaqing wrote:
 We have been continuously optimizing the performance of the hunt 
 library's IO module, but it has not reached the level of ASP.NET 
 Core, Rust, and Java.

 https://www.techempower.com/benchmarks/#section=test&runid=9e7a6863-b92e-4079-a2a9-324426369751&hw=ph&test=plaintext

 What good ideas do you have to offer?
If you could create a benchmark that anyone could easily set up and run, then I could provide concrete ideas or even patches.
May 08
Jacob Carlborg <doob me.com> writes:
On 2020-05-08 10:13, zoujiaqing wrote:
 We have been continuously optimizing the performance of the hunt library's IO 
 module, but it has not reached the level of ASP.NET Core, Rust, and Java.
 
 https://www.techempower.com/benchmarks/#section=test&runid=9e7a6863-b92e-4079-a2a9-324426369751&hw=ph&test=plaintext
 
 
 What good ideas do you have to offer?
I think 18 out of 404 (plaintext) and 8 out of 409 (JSON serialization) is pretty good. Compare that with vibe.d, which is down at 139, or something like that.

I haven't used Hunt, but I did have a brief look at the code base. It seems very class-centric. That means heap allocations (which are slow) and access through indirection, which is at least slower than direct access. Keep in mind that D's GC is stop-the-world. It would be interesting to see a benchmark with the GC turned off, or one using multiple processes (assuming that's not already the case) instead of multiple threads.

I think as much as possible should be based on structs. It's always simpler to turn a value type into a reference type (by embedding it in a class) than to do the opposite.

I haven't done any benchmarks, but when it comes to allocations it sounds like a request-local region allocator, possibly backed by a free list, would be efficient. The region allocator could first use a static array as its buffer, then fall back to allocating on the heap when the static buffer is full.

-- 
/Jacob Carlborg
May 11
welkam <wwwelkam gmail.com> writes:
On Monday, 11 May 2020 at 08:09:24 UTC, Jacob Carlborg wrote:
 It seems very class centric.
The problem with classes is that they introduce indirections. First, they are reference types, so any access to them goes through a pointer. Second, unless you mark methods as final, method calls go through the vtable, meaning that executing code on a piece of data can require two pointer dereferences.

I don't know how the GC allocates memory, but malloc gives 16-byte-aligned pointers, and if the GC does the same, then whenever your class size is not a multiple of 16 you get a lot of "padding" between your classes. Since the processor loads 64 bytes at a time, you are guaranteed that the padding will be loaded too, wasting cache space and bandwidth. It's better to use containers that put data in a contiguous piece of memory.

You can get most of what class inheritance gives you by using alias this on structs:

struct base { /* data */ }
struct example {
    base foo;
    alias foo this;
}

Jacob is correct to point out that memory management should not be overlooked. DMD got a big improvement when it switched to a bump-the-pointer allocator: https://www.drdobbs.com/cpp/increasing-compiler-speed-by-over-75/240158941

After that article I read another blog post inspired by it, where the author preallocated a bunch of memory and saw an over 100% speed improvement. Because we don't have profiling information, our advice can only be vague, not specific.
May 11
IGotD- <nise nise.com> writes:
On Monday, 11 May 2020 at 11:03:26 UTC, welkam wrote:
 On Monday, 11 May 2020 at 08:09:24 UTC, Jacob Carlborg wrote:
 It seems very class centric.
 [...]
This is one of my biggest discontents with the D design: classes are forced to be reference types (it is actually a value type where the pointer is wrapped in the base Object struct). The only way to have a class expanded inside the host class is to use a struct, but structs are limited in many ways. Classes expanded in host classes have worked well in C++, so I don't really understand the design decision here.

Heap-allocated member classes can sometimes be beneficial in terms of resource handling (when shared by several classes, for example), but most of the time it is better to expand them inside the class. So if a class has several member classes, all of them need to be allocated, which takes its toll on performance. It's a Java-ish approach, and it certainly has its performance drawbacks.

There have been discussions about increasing the capabilities of structs so that they can match classes better and people can use structs more, but those changes have been rejected.
May 11