
digitalmars.D - eventcore vs boost.asio performance?

reply zoujiaqing <zoujiaqing gmail.com> writes:
eventcore is a very good and stable network library based on the 
Proactor model.
The high-performing C++ frameworks in the TechEmpower benchmarks 
linked below all use asio.
What can we achieve if we use eventcore for network I/O?

  * https://github.com/vibe-d/eventcore
  * https://www.techempower.com/benchmarks/#section=data-r21&hw=cl&test=plaintext
Feb 19 2023
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 19.02.2023 at 20:53, zoujiaqing wrote:
 eventcore is a very good and stable network library based on the 
 Proactor model. The high-performing C++ frameworks in the TechEmpower 
 benchmarks linked below all use asio. What can we achieve if we use 
 eventcore for network I/O?
 
   * https://github.com/vibe-d/eventcore
   * https://www.techempower.com/benchmarks/#section=data-r21&hw=cl&test=plaintext
I'm not sure where it would be today on that list, but I got pretty competitive results for local tests on Linux a few years back. However, there are at least two performance-related issues still present:

- The API uses internal allocation of memory for I/O operations and per socket descriptor. This works around the lack of a way to disable struct moves (which makes it unsafe to store any kind of pointer to stack values). The allocations are pretty well optimized, but they do lead to some additional memory copies that impede performance.

- On both Linux and Windows, there are newer, faster I/O APIs: io_uring and RIO. A PR by Tobias Pankrath (https://github.com/vibe-d/eventcore/pull/175) for io_uring exists, but it still needs to be finished.

Another thing that needs to be tackled is better error propagation. Right now there is often just a generic "error" status, with no way to get a more detailed error code or message.

By the way, although the vibe.d HTTP implementation naturally adds some overhead over the raw network I/O, the vibe.d results in that list, judging by their poor performance on many-core machines, appear to be affected by GC runs or possibly some other lock contention, whereas the basic HTTP request handling should be more or less GC-free. So those results shouldn't be used for comparison.
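To make the struct-move point concrete, here is a minimal editorial sketch (not eventcore code) of why pointers into stack structs are unsafe in D: the compiler may move a struct with a plain bitwise copy, and there is no hook to fix pointers up afterwards.

// Minimal sketch (editorial illustration, not eventcore code) of the
// struct-move hazard: D may move structs with a bitwise copy, so an
// interior pointer can end up pointing at a dead stack location.
struct SelfRef
{
    int value;
    int* p; // intended to point at this.value

    void setup() { value = 42; p = &value; }
}

SelfRef make()
{
    SelfRef s;
    s.setup();
    return s; // the struct may be blitted (moved) into the caller...
}

void main()
{
    import std.stdio : writeln;
    auto s = make();
    // ...in which case s.p still holds the old, now-invalid address.
    // (With NRVO the pointer may happen to survive; nothing guarantees it.)
    writeln(s.p == &s.value ? "pointer survived (NRVO)"
                            : "dangling interior pointer!");
}

This is why the API allocates per-operation state internally instead of pointing at caller-owned stack slots.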
Feb 20 2023
next sibling parent reply Daniel Kozak <kozzi11 gmail.com> writes:
On Mon, Feb 20, 2023 at 9:30 AM Sönke Ludwig via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 ...

 By the way, although the vibe.d HTTP implementation naturally adds some
 overhead over the raw network I/O, the vibe.d results in that list,
 judging by their poor performance on many-core machines, appear to be
 affected by GC runs, or possibly some other lock contention, whereas the
 basic HTTP request handling should be more or less GC-free. So those
 shouldn't be used for comparison.
Last time I checked, the main reason vibe.d was slower was HTTP parsing. vibe-core with manual HTTP parsing was as fast as the other fastest alternatives.
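As a rough illustration of what "manual HTTP parsing" can mean for a plaintext benchmark, here is a hypothetical minimal request-line parser over a raw buffer (not Daniel's actual code):

// Hypothetical minimal sketch of "manual" HTTP/1.1 request parsing
// over a raw buffer, the kind of shortcut a benchmark server can take
// instead of a general-purpose parser.
import std.algorithm.searching : findSplit;
import std.string : indexOf;

struct RequestLine
{
    const(char)[] method;
    const(char)[] path;
    const(char)[] version_;
}

// Returns true and fills `req` if `buf` starts with a complete
// request line terminated by CRLF.
bool parseRequestLine(const(char)[] buf, out RequestLine req)
{
    auto eol = buf.indexOf("\r\n");
    if (eol < 0) return false;
    auto line = buf[0 .. eol];

    auto m = line.findSplit(" ");
    if (!m) return false;
    auto p = m[2].findSplit(" ");
    if (!p) return false;

    req = RequestLine(m[0], p[0], p[2]);
    return true;
}

unittest
{
    RequestLine r;
    assert(parseRequestLine("GET /plaintext HTTP/1.1\r\nHost: x\r\n\r\n", r));
    assert(r.method == "GET" && r.path == "/plaintext");
}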
Feb 20 2023
parent reply tchaloupka <chalucha gmail.com> writes:
On Monday, 20 February 2023 at 09:12:37 UTC, Daniel Kozak wrote:
 Last time I checked, the main reason vibe.d was slower was HTTP 
 parsing. vibe-core with manual HTTP parsing was as fast as the 
 other fastest alternatives.
I've compared the syscalls that various frameworks generate, and by far the biggest difference is that in vibe.d the response header and body are written in two separate syscalls (tested on Linux with epoll). That makes a pretty huge difference, about 30% if I remember correctly. Eventcore itself is not slow and is comparable with the top ones.

Tom
Feb 21 2023
parent Daniel Kozak <kozzi11 gmail.com> writes:
On Tue, Feb 21, 2023 at 10:45 AM tchaloupka via Digitalmars-d <digitalmars-d puremagic.com> wrote:

 On Monday, 20 February 2023 at 09:12:37 UTC, Daniel Kozak wrote:
 Last time I checked, the main reason vibe.d was slower was HTTP 
 parsing. vibe-core with manual HTTP parsing was as fast as the 
 other fastest alternatives.
 I've compared the syscalls that various frameworks generate, and by far the biggest difference is that in vibe.d the response header and body are written in two separate syscalls (tested on Linux with epoll). That makes a pretty huge difference, about 30% if I remember correctly. Eventcore itself is not slow and is comparable with the top ones.

 Tom
Yes, you are right. I changed that too when I was trying to make vibe.d as fast as possible.
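For illustration, the kind of change being described, writing header and body with one vectored syscall instead of two, might look like this (an assumed sketch using POSIX writev, not the actual vibe.d patch):

// Hedged sketch: coalesce the HTTP response header and body into a
// single writev(2) call instead of two separate write(2) calls.
// Illustrative only; a real implementation must also handle short
// writes and EINTR/EAGAIN.
version (Posix)
{
    import core.sys.posix.sys.uio : iovec, writev;

    ptrdiff_t sendResponse(int fd, const(ubyte)[] header, const(ubyte)[] body_)
    {
        iovec[2] iov;
        iov[0].iov_base = cast(void*) header.ptr;
        iov[0].iov_len  = header.length;
        iov[1].iov_base = cast(void*) body_.ptr;
        iov[1].iov_len  = body_.length;
        return writev(fd, iov.ptr, 2); // one syscall for both buffers
    }
}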
Feb 21 2023
prev sibling parent reply zoujiaqing <zoujiaqing gmail.com> writes:
On Monday, 20 February 2023 at 08:26:23 UTC, Sönke Ludwig wrote:
 I'm not sure where it would be today on that list, but I got 
 pretty competitive results for local tests on Linux a few years 
 back. However, there are at least two performance related 
 issues still present:

 - The API uses internal allocation of memory for I/O operations 
 and per socket descriptor. This is to work around the lack of a 
 way to disable struct moves (and thus making it unsafe to store 
 any kind of pointer to stack values). The allocations are 
 pretty well optimized, but it does lead to some additional 
 memory copies that impede performance.

 - On both Linux and Windows, there are newer, faster I/O APIs: 
 io_uring and RIO. A PR by Tobias Pankrath 
 (https://github.com/vibe-d/eventcore/pull/175) for io_uring 
 exists, but it still needs to be finished.

 Another thing that needs to be tackled is better error 
 propagation. Right now, there is often just a generic "error" 
 status, without the possibility to get a more detailed error 
 code or message.

 By the way, although the vibe.d HTTP implementation naturally 
 adds some overhead over the raw network I/O, the vibe.d results 
 in that list, judging by their poor performance on many-core 
 machines, appear to be affected by GC runs, or possibly some 
 other lock contention, whereas the basic HTTP request handling 
 should be more or less GC-free. So those shouldn't be used for 
 comparison.
First, io_uring is very much something to look forward to! When can you merge this PR?

Second, how can memory allocation and release be optimized under high concurrency? Nbuff is a great library, and I've used it before.
Mar 05 2023
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 05.03.2023 at 16:14, zoujiaqing wrote:
 On Monday, 20 February 2023 at 08:26:23 UTC, Sönke Ludwig wrote:
 I'm not sure where it would be today on that list, but I got pretty 
 competitive results for local tests on Linux a few years back. 
 However, there are at least two performance related issues still present:

 - The API uses internal allocation of memory for I/O operations and 
 per socket descriptor. This is to work around the lack of a way to 
 disable struct moves (and thus making it unsafe to store any kind of 
 pointer to stack values). The allocations are pretty well optimized, 
 but it does lead to some additional memory copies that impede 
 performance.

 - On both Linux and Windows, there are newer, faster I/O APIs: io_uring 
 and RIO. A PR by Tobias Pankrath 
 (https://github.com/vibe-d/eventcore/pull/175) for io_uring exists, 
 but it still needs to be finished.

 Another thing that needs to be tackled is better error propagation. 
 Right now, there is often just a generic "error" status, without the 
 possibility to get a more detailed error code or message.

 By the way, although the vibe.d HTTP implementation naturally adds 
 some overhead over the raw network I/O, the vibe.d results in that 
 list, judging by their poor performance on many-core machines, appear 
 to be affected by GC runs, or possibly some other lock contention, 
 whereas the basic HTTP request handling should be more or less 
 GC-free. So those shouldn't be used for comparison.
First, io_uring is very much something to look forward to! When can you merge this PR?
It doesn't pass the tests and has conflicts, so it needs some work. I could look into that, too, but I don't have much time available.
 Second, how can memory allocation and release be optimized under high 
 concurrency? Nbuff is a great library, and I've used it before.
Apart from accidental allocations in the timer code, there are very few allocations in eventcore itself. The allocation scheme exploits the small-integer nature of POSIX handles and keeps a fixed-size slot per file descriptor in an array of arrays. Buffer allocations for read/write operations are the responsibility of the library user.

In that regard, nbuff should be usable for high-level data buffers just fine, and although I haven't used it, it sounds like a very interesting concept in terms of using range interfaces with network data.

For vibe.d's HTTP server module, I'm using a free-list-based allocation scheme, where each request gets a pre-allocated buffer that is later reused by another request. This means that after a warmup phase there will be few to no allocations per request, at least in the core HTTP handling code.
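A rough sketch of the two schemes described above; names and sizes are illustrative, not the actual eventcore/vibe.d code:

// Fd-indexed slot table: POSIX fds are small integers, so fd can
// address a fixed slot in an array of fixed-size arrays. Slots never
// move once their chunk is allocated, so pointers to them stay valid.
struct SlotTable(Slot, size_t chunkSize = 1024)
{
    private Slot[][] chunks;

    ref Slot opIndex(int fd)
    {
        auto chunk = cast(size_t) fd / chunkSize;
        if (chunks.length <= chunk)
            chunks.length = chunk + 1;
        if (chunks[chunk] is null)
            chunks[chunk] = new Slot[chunkSize];
        return chunks[chunk][fd % chunkSize];
    }
}

// Free-list buffer pool in the spirit of the vibe.d HTTP server scheme
// described above (illustrative; assumes uniform buffer sizes): after
// a warmup phase, requests reuse released buffers instead of allocating.
struct BufferPool
{
    private ubyte[][] freeList;

    ubyte[] acquire(size_t size = 16 * 1024)
    {
        if (freeList.length)
        {
            auto buf = freeList[$ - 1];
            freeList = freeList[0 .. $ - 1];
            freeList.assumeSafeAppend(); // allow in-place re-append
            return buf;
        }
        return new ubyte[size]; // cold path: pool not warmed up yet
    }

    void release(ubyte[] buf) { freeList ~= buf; }
}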
Mar 06 2023
parent ikod <igor.khasilev gmail.com> writes:
On Tuesday, 7 March 2023 at 07:55:04 UTC, Sönke Ludwig wrote:
 Am 05.03.2023 um 16:14 schrieb zoujiaqing:
 On Monday, 20 February 2023 at 08:26:23 UTC, Sönke Ludwig 
 wrote:
 In that regard, nbuff should be usable for high-level data 
 buffers just fine and although I haven't used it, it sounds 
 like a very interesting concept in terms of using range 
 interfaces with network data.
There are a few simple ideas behind nbuff:

1) All received network data is immutable; this allows buffers to be shared safely.

2) We usually get data from the network as chunks of a "contiguous and endless" stream, but we are actually interested only in a small, forward-moving window of that data. So it is nice to automate receiving new chunks, processing them in the current "window", and safely throwing them away as soon as they are processed.

Nbuff manages a list of smart pointers to immutable byte buffers to implement this view of the problem.
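A simplified sketch of that moving-window idea (hypothetical code, not nbuff's actual API):

// Hypothetical sketch, not nbuff's actual API: a forward-moving
// window over immutable network chunks. Immutability makes sharing
// safe; fully consumed chunks are dropped so they can be reclaimed.
struct ChunkWindow
{
    private immutable(ubyte)[][] chunks; // nbuff uses smart pointers /
                                         // refcounting instead of the GC
    private size_t offset; // bytes already consumed in chunks[0]

    // Append a freshly received chunk.
    void push(immutable(ubyte)[] chunk) { chunks ~= chunk; }

    // Consume n bytes from the front, releasing exhausted chunks.
    void popFrontN(size_t n)
    {
        while (n > 0 && chunks.length)
        {
            auto avail = chunks[0].length - offset;
            if (n < avail) { offset += n; return; }
            n -= avail;
            chunks = chunks[1 .. $]; // drop the consumed chunk
            offset = 0;
        }
    }
}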
Mar 08 2023