
digitalmars.D - Remember that Go vs D MQTT thing and how we wondered about dmd vs gdc?

reply "Atila Neves" <atila.neves gmail.com> writes:
Well, I found out the other day that vibe.d compiles with gdc now 
so I went back to see if it made any difference to the benchmarks 
I had.

In throughput it made none.

In the latency one it was about 5-10% faster with gdc compared to 
dmd, which is good, but it still didn't change the relative 
positions of the languages.

So that was anti-climactic. :P

Atila
Mar 06 2014
next sibling parent reply "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with gdc 
 now so I went back to see if it made any difference to the 
 benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc compared 
 to dmd, which is good, but it still didn't change the relative 
 positions of the languages.

 So that was anti-climactic. :P

 Atila

I suspect vibe.d's performance is heavily dependent on the system's state, i.e. the HDD, not so much on the code generation. I don't know where we can get more performance out of it, but something doesn't quite feel right.
Mar 06 2014
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/7/14, 12:45 AM, Bienlein wrote:
 If you want to give D a boost, put Go-style CSP and green threads into
 it as well. Then D will start to fly. Otherwise it will have to continue
 competing against C++ as its sole application area where it will always
 remain a niche player, because of the market dominance of C++.

Interesting you should mention that. Walter has been mulling over a possible DIP on that. Andrei
Mar 07 2014
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/8/14, 3:22 AM, Russel Winder wrote:
 Dataflow, though, is where "Big Data" is going. There are commercial
 offerings in the JVM space and they are making huge profits on
 licensing, simply because the frameworks work.

Do you have a couple of relevant links describing dataflow? Andrei
Mar 08 2014
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/7/14, 12:45 AM, Bienlein wrote:
 If you want to give D a boost, put Go-style CSP and green threads into
 it as well. Then D will start to fly. Otherwise it will have to continue
 competing against C++ as its sole application area where it will always
 remain a niche player, because of the market dominance of C++.

One question - doesn't Vibe.d already use green threads? Andrei
Mar 07 2014
prev sibling next sibling parent Shammah Chancellor <anonymous coward.com> writes:
On 2014-03-07 21:11:12 +0000, Bienlein said:

 What they are saying on their web site is that they are using fibers 
 and at the same time they say they are using libevent. That is 
 confusing for me. On http://vibed.org/features they write: "Instead of 
 using classic blocking I/O together with multi-threading for doing 
 parallel network and file operations, all operations are using 
 asynchronous operating system APIs. By default, >>libevent<< is used to 
 access these APIs operating system independently."
 
 Further up on the same page they write: "The approach of vibe.d is to 
 use asynchronous I/O under the hood, but at the same time make it seem 
 as if all operations were synchronous and blocking, just like ordinary 
 I/O. What makes this possible is D's support for so called >>fibers<<".

That is all correct. Libevent supplies the polling and async I/O; D provides the ability to do fibers. Mixed together you get a very robust, easy-to-program, scalable web platform. See below.
 
 It does. Bienlein has a very vague knowledge of topics he
 comments about.

I thought the vibe.d guys would shed some light on this at the occasion, but no luck. What I don't understand is how fibers can listen for input that comes in through the connections they hold on to. AFAICS, a fiber only becomes active when its call method is called. So who calls the call method when a connection becomes active? Is that then again a kernel thread? How does the kernel thread know something arrived through a connection? It can't do a blocking wait, as the system would run out of kernel threads very quickly.

Fibers are cooperatively multitasked routines. Whenever vibe.d uses a libevent I/O function, it yields its current operation back to the event loop. When a libevent poll indicates there is data waiting, it resumes that fiber where it left off. The vibe.d event loop is essentially the scheduler for the fibers.
Mar 07 2014
prev sibling next sibling parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig+dforum outerproduct.org> writes:
Am 07.03.2014 22:11, schrieb Bienlein:
 One question - doesn't Vibe.d already use green threads?

What they are saying on their web site is that they are using fibers and at the same time they say they are using libevent. That is confusing for me. On http://vibed.org/features they write: "Instead of using classic blocking I/O together with multi-threading for doing parallel network and file operations, all operations are using asynchronous operating system APIs. By default, >>libevent<< is used to access these APIs operating system independently." Further up on the same page they write: "The approach of vibe.d is to use asynchronous I/O under the hood, but at the same time make it seem as if all operations were synchronous and blocking, just like ordinary I/O. What makes this possible is D's support for so called >>fibers<<".
 It does. Bienlein has a very vague knowledge of topics he
 comments about.

I thought the vibe.d guys would shed some light on this at the occasion, but no luck. What I don't understand is how fibers can listen for input that comes in through the connections they hold on to. AFAICS, a fiber only becomes active when its call method is called. So who calls the call method when a connection becomes active? Is that then again a kernel thread? How does the kernel thread know something arrived through a connection? It can't do a blocking wait, as the system would run out of kernel threads very quickly.

Sorry, I've been busy with some non-programming business over the past days and didn't have a chance to reply. A small article about the internal workings of the task/fiber system has been planned for a long time now, but there are so many items with higher priority that it unfortunately hasn't happened so far. See my reply [1] in the other thread for a rough outline.
 I think what Go and Erlang do is to use green threads (or equivalent,
 goroutines in Go) for the applications side and a kernel thread pool
 within the runtime doing "work stealing" on the green threads. This is
 more or less (ish) what the Java Fork/Join framework of Doug Lea does as
 well.

When a channel in Go runs empty, the scheduler detaches the thread that served it and attaches it to a non-empty channel. In Go all of this is in the language and the runtime, where it can be done more efficiently than in a library. AFAIU, this is a main selling point of Go.

I actually don't see a reason why it can't be just as efficient when done as a library. Taking the example of vibe.d, fibers are currently never moved between threads (although technically, they could), but they are still stored in a free list and reused for later tasks. There is not much more overhead than a few variable assignments and the fiber context switches.
 Vert.x is claiming to be able to handle millions of active connections.

All right, as you can't have millions of threads on the JVM, they must do that through some asynchronous approach (I guess Java NIO). I read that an asynchronous solution is not as fast as one with many blocking threads, as in Go or Erlang. I don't understand why; it was just claimed that this was the case.

AFAIK they use a combination of callback-based asynchronous I/O (mostly for server applications) with a thread pool for parallelizing synchronous I/O (mostly for client-type applications/tasks). So it's basically a hybrid system that still makes a lot of trade-offs between performance and comfort. Disclaimer: this statement is based only on looking at a few examples and maybe a blog post; I don't have any first-hand experience with vert.x.
Mar 12 2014
prev sibling parent reply =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig+dforum outerproduct.org> writes:
Am 07.03.2014 23:29, schrieb Sean Kelly:
 On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:
 On Fri, 2014-03-07 at 16:53 +0000, Sean Kelly wrote:
 […]
 68K connections is nothing. I'll start getting interested when his
 benchmarks are 200K+.  Event-based systems in C can handle millions
 of concurrent connections if implemented properly. I'd like to
 believe vibe.d can approach this as well.

There used to be a 100k problem, i.e. maintaining more than 100k active connections (active meaning regularly causing traffic, not just being dormant for a few centuries), but so many frameworks can now support that that it has become a non-metric. I don't know if Spring or JavaEE can handle this, but on the JVM Vert.x certainly can, and I suspect Node.js can as well. Vert.x is claiming to be able to handle millions of active connections. I suspect it is now at the stage where the OS is the bottleneck, not the language or the framework.

I think the biggest issue at very large numbers of connections is memory use. In fact, I don't expect even vibe.d to scale beyond a few hundred K if it allocates a fiber per connection. It would have to use a free list of fibers and make a top-level read effectively release the current fiber into the free list. Scaling at this level in C generally meant retaining little to no state per connection, basically by necessity.

A free list is already used for fibers, actually. Each fiber can be reused for any number of "tasks". This is also why `Fiber` as a type doesn't occur in the public API, but rather the `Task` struct, which internally points to a fiber + a task ID.

But since the memory pages of a fiber's stack are allocated lazily, at least on a 64-bit OS, where address space is not an issue, you can actually scale to very high numbers with a decent amount of RAM. Certainly you don't need to have the amount of RAM that the typical dedicated server for such tasks would have.

Having said that, it may be an interesting idea to offer a callback based overload of waitForData(), so that you can do something like this:

listenTCP(port, &onConnection);

void onConnection(TCPConnection conn)
{
    conn.waitForData(&onData);
    // return (exits the task and puts the fiber
    // into the free list)
}

void onData(TCPConnection conn)
{
    // onData gets called as a new task, so that no fiber is
    // occupied between the wait and the read calls
    conn.read(...);
}
Mar 12 2014
parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig+dforum outerproduct.org> writes:
Am 18.03.2014 03:15, schrieb Marco Leise:
 Am Wed, 12 Mar 2014 10:41:11 +0100
 schrieb Sönke Ludwig <sludwig+dforum outerproduct.org>:

 But since the memory pages of a fiber's stack are allocated lazily, at
 least on a 64-bit OS, where address space is not an issue, you can
 actually scale to very high numbers with a decent amount of RAM.

This means for each fiber, you allocate e.g. 1 MiB virtual memory as a stack and let page faults allocate them from RAM on demand, right?

Exactly. Currently the stack size is set to only 64k to get a good trade-off on 32-bit systems, but I've been thinking about using a version(D_LP64) block to increase this default for 64-bit.
Mar 18 2014
prev sibling next sibling parent "Atila Neves" <atila.neves gmail.com> writes:
It was already far above the competition in the throughput 
benchmark anyway. What exactly doesn't feel right to you?

On Friday, 7 March 2014 at 05:44:16 UTC, Rikki Cattermole wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with gdc 
 now so I went back to see if it made any difference to the 
 benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc compared 
 to dmd, which is good, but it still didn't change the relative 
 positions of the languages.

 So that was anti-climactic. :P

 Atila

I suspect vibe.d's performance is heavily dependent on the system's state, i.e. the HDD, not so much on the code generation. I don't know where we can get more performance out of it, but something doesn't quite feel right.

Mar 07 2014
prev sibling next sibling parent reply "Bienlein" <jeti789 web.de> writes:
On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:

 I'm suspecting that Vibe's performance is heavily based upon 
 the systems state i.e. hdd. Not so much on the code generation.
 I don't know where we can get more performance out of it. But 
 something doesn't quite feel right.


Robert Pike, the Go lead developer, published this tweet some days ago: "Just looked at a Google-internal Go server with 139K goroutines serving over 68K active network connections. Concurrency wins."

Seen that way, your MQTT benchmark falls short with its maximum of 1k connections. You need to repeat it with 50k and 100k connections. Then Go and Erlang will rock and leave D behind. If you want to be fair to Erlang you need to make a benchmark run with 1,000k connections and more, see https://www.erlang-solutions.com/about/news/erlang-powered-whatsapp-exceeds-200-million-monthly-users

I don't like Go's simplistic nature, either, but Go is not about the language. It is about making concurrency much simpler and allowing for many, many threads. IMHO this is what gives Go the attention. Except for Erlang, no other system/language than Go can get something similar accomplished (except maybe Rust when it is finished, but it is not clear whether it will have good build times like Go or D).

If you want to give D a boost, put Go-style CSP and green threads into it as well. Then D will start to fly. Otherwise it will have to continue competing against C++ as its sole application area, where it will always remain a niche player because of the market dominance of C++.
Mar 07 2014
parent Shammah Chancellor <anonymous coward.com> writes:
On 2014-03-07 08:45:21 +0000, Bienlein said:

 On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:
 
 I'm suspecting that Vibe's performance is heavily based upon the 
 systems state i.e. hdd. Not so much on the code generation.
 I don't know where we can get more performance out of it. But something 
 doesn't quite feel right.


Robert Pike, the Go lead developer, published this tweet some days ago: "Just looked at a Google-internal Go server with 139K goroutines serving over 68K active network connections. Concurrency wins." Seen that way, your MQTT benchmark falls short with its maximum of 1k connections. You need to repeat it with 50k and 100k connections. Then Go and Erlang will rock and leave D behind. If you want to be fair to Erlang you need to make a benchmark run with 1,000k connections and more, see https://www.erlang-solutions.com/about/news/erlang-powered-whatsapp-exceeds-200-million-monthly-users I don't like Go's simplistic nature, either, but Go is not about the language. It is about making concurrency much simpler and allowing for many, many threads. IMHO this is what gives Go the attention. Except for Erlang, no other system/language than Go can get something similar accomplished (except maybe Rust when it is finished, but it is not clear whether it will have good build times like Go or D). If you want to give D a boost, put Go-style CSP and green threads into it as well. Then D will start to fly. Otherwise it will have to continue competing against C++ as its sole application area, where it will always remain a niche player because of the market dominance of C++.

Have you used vibe.d? It already supports in-process fibers, and much of the work that Sönke is doing is being ported to Phobos. I have no trouble believing that MQTT implemented on top of vibe.d could compete with Go or Erlang. If it can't do it right now, it's not because of a fundamental design problem, but because of bugs. -S.
Mar 07 2014
prev sibling next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with gdc 
 now so I went back to see if it made any difference to the 
 benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc compared 
 to dmd, which is good, but it still didn't change the relative 
 positions of the languages.

 So that was anti-climactic. :P

 Atila

Have you done any profiling of your code to really get a feel for what's taking the time? If it really is I/O bound then there's nothing gdc can do to make it better. Having said that, I've been getting similar results from gdc and dmd recently too, with ldc coming out as a very clear winner.
Mar 07 2014
prev sibling next sibling parent "Atila Neves" <atila.neves gmail.com> writes:
I profiled it. For throughput it was already I/O bound, so I'm not 
surprised gdc wasn't able to make it go faster. For the latency 
one the profiler logs were harder to grok, but I can't really 
remember what was going on there anymore.

Atila

On Friday, 7 March 2014 at 09:15:15 UTC, John Colvin wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with gdc 
 now so I went back to see if it made any difference to the 
 benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc compared 
 to dmd, which is good, but it still didn't change the relative 
 positions of the languages.

 So that was anti-climactic. :P

 Atila

Have you done any profiling of your code to really get a feel for what's taking the time? If it really is I/O bound then there's nothing gdc can do to make it better. Having said that, I've been getting similar results from gdc and dmd recently too, with ldc coming out as a very clear winner.

Mar 07 2014
prev sibling next sibling parent "Atila Neves" <atila.neves gmail.com> writes:
I initially capped the benchmarks at 1k connections because I ran 
out of file handles and didn't feel like modifying my system.

I don't know why you think that "Then Go and Erlang will rock and 
leave D behind" when:

. I don't see any new data to back that up
. Extrapolating the existing MQTT data doesn't suggest that

If the Erlang and Go implementations were slower but seemed to 
scale better then sure, but that's not what the data show at all.

Since there's no substitute for cold hard data, I went back to the 
measurements after setting my hard limit for file handles to 150k 
and using ulimit. I only bothered with Go, D and Erlang. 
Unfortunately, the most I got away with was around 7500 
connections for loadtest. Any more than that and I got failures. 
I suspect this might be a limitation of the benchmark itself, 
which was written in Go. The failures happened for all 3 
implementations. I managed to get up to 10k connections for 
pingtest. It failed a lot though.

The results? There's probably a problem with the Erlang 
implementation but I don't know because I didn't write it. But 
its performance falls off a cliff in both benchmarks as the 
number of connections gets up to or close to 10k.

For loadtest D beats both Go and Erlang and there's no sign of Go 
scaling better (the Erlang one definitely didn't). For pingtest 
at 10k Go seems to start doing better than D, so maybe you have a 
point there.

I suspect you might have missed the point of my original blog 
post. Yes, it shows D beating Erlang and Go, and that's something 
I obviously like. But that wasn't the point I was trying to make. 
My point was that just writing it in Go doesn't mean magical 
performance benefits because of its CSP, and that vibe.d's fibers 
would do just fine in a direct competition. The data seem to 
support that.

Atila


On Friday, 7 March 2014 at 08:45:22 UTC, Bienlein wrote:
 On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:

 I'm suspecting that Vibe's performance is heavily based upon 
 the systems state i.e. hdd. Not so much on the code 
 generation.
 I don't know where we can get more performance out of it. But 
 something doesn't quite feel right.


Robert Pike, the Go lead developer, published this tweet some days ago: "Just looked at a Google-internal Go server with 139K goroutines serving over 68K active network connections. Concurrency wins." Seen that way, your MQTT benchmark falls short with its maximum of 1k connections. You need to repeat it with 50k and 100k connections. Then Go and Erlang will rock and leave D behind. If you want to be fair to Erlang you need to make a benchmark run with 1,000k connections and more, see https://www.erlang-solutions.com/about/news/erlang-powered-whatsapp-exceeds-200-million-monthly-users I don't like Go's simplistic nature, either, but Go is not about the language. It is about making concurrency much simpler and allowing for many, many threads. IMHO this is what gives Go the attention. Except for Erlang, no other system/language than Go can get something similar accomplished (except maybe Rust when it is finished, but it is not clear whether it will have good build times like Go or D). If you want to give D a boost, put Go-style CSP and green threads into it as well. Then D will start to fly. Otherwise it will have to continue competing against C++ as its sole application area, where it will always remain a niche player because of the market dominance of C++.

Mar 07 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
Erlang is likely to have an advantage in applications that are both 
concurrent and allocation-heavy because of its specialized garbage 
collector, which is effectively a region per Erlang process, 
discarded all at once.

That won't affect throughput though, only latency and still 
nothing you can't do with D if needed.
Mar 07 2014
prev sibling next sibling parent "Rikki Cattermole" <alphaglosined gmail.com> writes:
On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:
 It was already far above the competition in the throughput 
 benchmark anyway. What exactly doesn't feel right to you?

 On Friday, 7 March 2014 at 05:44:16 UTC, Rikki Cattermole wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with gdc 
 now so I went back to see if it made any difference to the 
 benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc 
 compared to dmd, which is good, but it still didn't change 
 the relative positions of the languages.

 So that was anti-climactic. :P

 Atila

I suspect vibe.d's performance is heavily dependent on the system's state, i.e. the HDD, not so much on the code generation. I don't know where we can get more performance out of it, but something doesn't quite feel right.


Mostly related to how heavy an effect a system's I/O can have on performance, i.e. the HDD. Avast makes things a lot worse as well, thanks to its file system shield. We could possibly get a performance gain by utilising Windows' event manager, at least for Windows.
Mar 07 2014
prev sibling next sibling parent "Atila Neves" <atila.neves gmail.com> writes:
I run Linux.

Atila

On Friday, 7 March 2014 at 13:31:06 UTC, Rikki Cattermole wrote:
 On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:
 It was already far above the competition in the throughput 
 benchmark anyway. What exactly doesn't feel right to you?

 On Friday, 7 March 2014 at 05:44:16 UTC, Rikki Cattermole 
 wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:
 Well, I found out the other day that vibe.d compiles with 
 gdc now so I went back to see if it made any difference to 
 the benchmarks I had.

 In throughput it made none.

 In the latency one it was about 5-10% faster with gdc 
 compared to dmd, which is good, but it still didn't change 
 the relative positions of the languages.

 So that was anti-climactic. :P

 Atila

I suspect vibe.d's performance is heavily dependent on the system's state, i.e. the HDD, not so much on the code generation. I don't know where we can get more performance out of it, but something doesn't quite feel right.


Mostly related to how heavy an effect a system's I/O can have on performance, i.e. the HDD. Avast makes things a lot worse as well, thanks to its file system shield. We could possibly get a performance gain by utilising Windows' event manager, at least for Windows.

Mar 07 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 7 March 2014 at 08:45:22 UTC, Bienlein wrote:
 On Friday, 7 March 2014 at 08:23:09 UTC, Atila Neves wrote:

 I'm suspecting that Vibe's performance is heavily based upon 
 the systems state i.e. hdd. Not so much on the code 
 generation.
 I don't know where we can get more performance out of it. But 
 something doesn't quite feel right.


Robert Pike, the Go lead developer, some days ago published this tweet: "Just looked at a Google-internal Go server with 139K goroutines serving over 68K active network connections. Concurrency wins."

68K connections is nothing. I'll start getting interested when his benchmarks are 200K+. Event-based systems in C can handle millions of concurrent connections if implemented properly. I'd like to believe vibe.d can approach this as well.
 In that way your MQTT benchmarks falls short with a maximum of 
 1k connections. You need to repeat it with 50k and 100k 
 connections. Then Go and Erlang will rock and leave D behind. 
 If you want to be fair with Erlang you need to make a benchmark 
 run with 1.000k connections and more, see 
 https://www.erlang-solutions.com/about/news/erlang-powered-whatsapp-exceeds-200-million-monthly-users

Does Erlang really scale that well for network IO? I love their actor model, but their network programming model kind of stinks.
 I don't like Go's simplistic nature, either, but Go is not 
 about the language. It is about making concurrency much simpler 
 and allowing for many many threads. IMHO this is what gives Go 
 the attention. Except for Erlang no other system/language than 
 Go can get something similar accomplished (except Rust maybe 
 when it is finished, but it is not clear whether it will have 
 good built times like Go or D).

 If you want to give D a boost, put Go-style CSP and green 
 threads into it as well. Then D will start to fly. Otherwise it 
 will have to continue competing against C++ as its sole 
 application area where it will always remain a niche player, 
 because of the market dominance of C++.

vibe.d already works this way. And there's a pull request in place to make std.concurrency support green threads. I think we're really pretty close. I do need to set aside some time to start on IPC though.
Mar 07 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Fri, 2014-03-07 at 12:23 +0000, Atila Neves wrote:
[…]
 I suspect you might have missed the point of my original blog
 post. Yes, it shows D beating Erlang and Go, and that's something
 I obviously like. But that wasn't the point I was trying to make.
 My point was that just writing it in Go doesn't mean magical
 performance benefits because of its CSP, and that vibe.d's fibers
 would do just fine in a direct competition. The data seem to
 support that.

That doesn't mean CSP and dataflow implementations for D (à la DataRush, GPars, Go, PythonCSP, PyCSP) shouldn't be attempted. Sadly I think I do not have the time to drive such an endeavour, but I wish I could contribute to it if someone else could drive.
-- 
Russel.
=============================================================================
Dr Russel Winder      t: +44 20 7585 2200   voip: sip:russel.winder ekiga.net
41 Buckmaster Road    m: +44 7770 465 077   xmpp: russel winder.org.uk
London SW11 1EN, UK   w: www.russel.org.uk  skype: russel_winder
Mar 07 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Fri, 2014-03-07 at 16:53 +0000, Sean Kelly wrote:
[…]
 68K connections is nothing. I'll start getting interested when
 his benchmarks are 200K+. Event-based systems in C can handle
 millions of concurrent connections if implemented properly. I'd
 like to believe vibe.d can approach this as well.

There used to be a 100k problem, i.e. maintaining more than 100k active connections (active meaning regularly causing traffic, not just being dormant for a few centuries), but so many frameworks can now support that that it has become a non-metric. I don't know if Spring or JavaEE can handle this, but on the JVM Vert.x certainly can, and I suspect Node.js can as well. Vert.x is claiming to be able to handle millions of active connections. I suspect it is now at the stage where the OS is the bottleneck, not the language or the framework.
 I don't like Go's simplistic nature, either, but Go is not
 about the language. It is about making concurrency much simpler
 and allowing for many many threads. IMHO this is what gives Go
 the attention. Except for Erlang no other system/language than
 Go can get something similar accomplished (except Rust maybe
 when it is finished, but it is not clear whether it will have
 good build times like Go or D).

 If you want to give D a boost, put Go-style CSP and green
 threads into it as well. Then D will start to fly. Otherwise it
 will have to continue competing against C++ as its sole
 application area where it will always remain a niche player,
 because of the market dominance of C++.

vibe.d already works this way. And there's a pull request in place to make std.concurrency support green threads. I think we're really pretty close. I do need to set aside some time to start on IPC though.

I agree that as a stripped down C, Go sucks. But as a strongly typed language, unlike C, it is not bad. And as everyone agrees (I hope), Go's USP is CSP (*). The whole goroutines thing (and the QML capability) keeps me using Go. And to be honest, the whole interfaces model, statically typed but duck typed, is great fun.

I think what Go and Erlang do is to use green threads (or the equivalent, goroutines in Go) for the application side and a kernel thread pool within the runtime doing "work stealing" on the green threads. This is more or less (ish) what the Java Fork/Join framework of Doug Lea does as well. The upshot is that you appear to be able to have thousands of threads in your program but maybe only a few actual kernel threads doing the work.

(*) Rob Pike reports that he and co-workers came up with the Go model independently of Hoare's CSP, via the Newsqueak, Alef, Limbo, Go sequence. I see no reason to disbelieve him. Whatever the truth, Go is now marketed as realizing CSP, not the Hoare variant of 1978 but CSP with amendments introduced over time. It's just a pity no-one yet has a realization of π-calculus as well, other than the programming language Pict and the Scala library PiLib.
-- 
Russel.
Mar 07 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:
 I suspect it is now at the stage that the OS is the bottleneck 
 not the
 language of the framework.

I think specialized operating systems devoted to a single service will be the future of high-load web projects, similar to the current reality of hard real-time services.
Mar 07 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Friday, 7 March 2014 at 19:01:34 UTC, Andrei Alexandrescu
wrote:
 On 3/7/14, 12:45 AM, Bienlein wrote:
 If you want to give D a boost, put Go-style CSP and green 
 threads into
 it as well. Then D will start to fly. Otherwise it will have 
 to continue
 competing against C++ as its sole application area where it 
 will always
 remain a niche player, because of the market dominance of C++.

One question - doesn't Vibe.d already use green threads? Andrei

It does. Bienlein has a very vague knowledge of topics he comments about.
Mar 07 2014
prev sibling next sibling parent "Graham Fawcett" <fawcett uwindsor.ca> writes:
On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:

 It's just a pity no-one yet has a
 realization of π-calculus as well – other than the programming 
 language Pict, and the Scala library PiLib.

JoCaml, an extension of OCaml, also comes to mind. It's join-calculus, not pi-calculus, but I understand that each can be encoded in the other.

Graham
Mar 07 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Fri, 2014-03-07 at 19:03 +0000, Dicebot wrote:
 On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:
 I suspect it is now at the stage that the OS is the bottleneck 
 not the
 language of the framework.

I think specialized operating systems devoted to single service will be the future of high load web projects similar to current realities of hard real-time services.

I guess we just have to look at Bitcoin mining to appreciate how slowly Web server technology actually moves.

--
Russel.
Mar 07 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Fri, 2014-03-07 at 19:16 +0000, Graham Fawcett wrote:
 On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:

 It's just a pity no-one yet has a
 realization of π-calculus as well – other than the programming
 language Pict, and the Scala library PiLib.

JoCaml, an extension of OCaml, also comes to mind. It's join-calculus, not pi-calculus, but I understand that each can be encoded in the other.

Graham

I haven't done anything with OCaml other than compiling Unison, so didn't realize they had gone this route.

Re join-calculus vs π-calculus, I have no direct experience, but I suspect that it will be like actors and dataflow and CSP: each can be realized in one of the others, but if you want things to be efficient you realize them separately.

--
Russel.
Mar 07 2014
prev sibling next sibling parent "Bienlein" <jeti789 web.de> writes:
On Friday, 7 March 2014 at 12:23:09 UTC, Atila Neves wrote:

My point was that just by writing it in Go doesn't mean magical 
performance benefits because of its CSP, and that vibe.d's 
fibers would do just fine in a direct competition. The data seem 
to support that.

Right. I was referring to a large number of threads apparently not being a problem in Go. It was not about execution speed. This way, admittedly, I hijacked the thread a bit.
68K connections is nothing. I'll start getting interested when 
his benchmarks are 200K+.  Event-based systems in C can handle 
millions of concurrent connections if implemented properly.  I'd 
like to believe vibe.d can approach this as well.

That's good to hear. I read a blog post from a company that changed from using C with libevent to Go. I searched for it now for quite a while, but couldn't find it again. From what I remember they claimed they could now handle many more connections using Go.
 One question - doesn't Vibe.d already use green threads?

What they are saying on their web site is that they are using fibers and at the same time they say they are using libevent. That is confusing for me. On http://vibed.org/features they write:

"Instead of using classic blocking I/O together with multi-threading for doing parallel network and file operations, all operations are using asynchronous operating system APIs. By default, >>libevent<< is used to access these APIs operating system independently."

Further up on the same page they write:

"The approach of vibe.d is to use asynchronous I/O under the hood, but at the same time make it seem as if all operations were synchronous and blocking, just like ordinary I/O. What makes this possible is D's support for so called >>fibers<<".
It does. Bienlein has a very vague knowledge of topics he
comments about.

I thought the vibe.d guys would shed some light on this at the occasion, but no luck. What I don't understand is how fibers can listen to input that comes in through connections they hold on to. AFAICS, a fiber only becomes active when its call method is called. So who calls the call method in case a connection becomes active? Is that then again a kernel thread? How does the kernel thread know something arrived through a connection? It can't do a blocking wait, as the system would run out of kernel threads very quickly.
I think what Go and Erlang do is to use green threads (or 
equivalent,
goroutines in Go) for the applications side and a kernel thread 
pool
within the runtime doing "work stealing" on the green threads. 
This is
more or less (ish) what the Java Fork/Join framework of Doug Lea 
does as
well.

When a channel in Go runs empty, the scheduler detaches the thread that served it and attaches it to a non-empty channel. In Go all this is in the language and the runtime, where it can be done more efficiently than in a library. AFAIU, this is a main selling point of Go.
 Vert.x is claiming to be able to handle millions of active 
connections.

All right, as you can't have millions of threads on the JVM they must do that through some asynchronous approach (I guess Java NIO). I read that an asynchronous solution is not as fast as one with many blocking threads as in Go or Erlang. I don't understand why; it was just claimed that this was the case.
Mar 07 2014
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 07/03/14 10:15, John Colvin wrote:
 Having said that, I've been getting similar results from gdc and dmd recently
 too, with ldc coming out as a very clear winner.

Yup, this has been my experience for a while now too. I don't know what changed in the LLVM backend (or LDC's exploitation of its features) but LDC is clearly ahead in its ability to optimize D code. That said, the main place where DMD lags AFAICS is number-crunching. Other stuff, probably not so much; and at least in my own benchmarks of various bits of code, there are occasional surprises where some particular case seems to run faster when compiled with DMD.
Mar 07 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:
 On Fri, 2014-03-07 at 16:53 +0000, Sean Kelly wrote:
 […]
 68K connections is nothing. I'll start getting interested when 
 his benchmarks are 200K+.  Event-based systems in C can handle 
 millions of concurrent connections if implemented properly.  
 I'd like to believe vibe.d can approach this as well.

There used to be a 100k problem, i.e. maintaining more than 100k active connections (that means regularly causing traffic, not just being dormant for a few centuries), but so many frameworks can now support that, that it has become a non-metric. I don't know if Spring or JavaEE can handle this, but on the JVM Vert.x certainly can; I suspect Node.js can as well. Vert.x is claiming to be able to handle millions of active connections. I suspect it is now at the stage that the OS is the bottleneck, not the language of the framework.

I think the biggest issue at very large number of connections is memory use. In fact, I don't expect even vibe.d to scale beyond a few hundred K if it allocates a fiber per connection. It would have to use a free list of fibers and make a top-level read effectively release the current fiber into the free list. Scaling at this level in C generally meant retaining little to no state per connection basically by necessity.
Mar 07 2014
prev sibling next sibling parent "Brad Anderson" <eco gnuk.net> writes:
On Friday, 7 March 2014 at 19:03:29 UTC, Dicebot wrote:
 On Friday, 7 March 2014 at 18:58:18 UTC, Russel Winder wrote:
 I suspect it is now at the stage that the OS is the bottleneck 
 not the
 language of the framework.

I think specialized operating systems devoted to single service will be the future of high load web projects similar to current realities of hard real-time services.

I think you are right. There seems to be a lot of attention, now that C10K is winding down, toward addressing the next bottleneck: the OS. People are increasingly circumventing the OS and reading/writing directly from/to the network interface.

http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html
Mar 07 2014
prev sibling next sibling parent "Bienlein" <jeti789 web.de> writes:
On Friday, 7 March 2014 at 18:56:05 UTC, Andrei Alexandrescu 
wrote:
 On 3/7/14, 12:45 AM, Bienlein wrote:
 If you want to give D a boost, put Go-style CSP and green 
 threads into
 it as well. Then D will start to fly. Otherwise it will have 
 to continue
 competing against C++ as its sole application area where it 
 will always
 remain a niche player, because of the market dominance of C++.

Interesting you should mention that. Walter has been mulling over a possible DIP on that.

Would be awesome if D got some kind of CSP. I used to reproduce deadlocks and race conditions for some years in a shop floor manufacturing system and fix them. From that experience I can say that you really run into much less trouble with channels as in Go compared to using locks, semaphores, etc. You can even gradually improve your concurrent solution as you can stick to channels to which your threads are bound. Without them, threads go through everything; locks don't help with the structuring but only increase complexity. If you realize there is some mutex missing, it can be very hard to move it into place and only have little code in the mutex block. Changing concurrent code based on locks is very deadlock-critical. So being defensive you put the mutex block around a lot of code rather than refactoring it to get the mutex block small to have little lock contention. With CSP you only have to fix the way you deal with some channel or introduce some other channel. CSP is truly a step ahead IMHO.

-- Bienlein
Mar 07 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Fri, 2014-03-07 at 23:18 +0000, Bienlein wrote:
[…]
 Would be awesome if D got some kind of CSP. I used to reproduce

 to have little lock contention. With CSP you only have to fix the
 way you deal with some channel or introduce some other channel.
 CSP is truly a step ahead IMHO.

Actors, dataflow and CSP are three different models of using processes and message passing. They are applicable in different situations. Clearly dataflow and CSP are closer to each other than either is to actors.

Go has chosen to focus only on CSP (sort of, see previous emails) and ignore dataflow and actors at the language level. This may be an error. In GPars, we have chosen to keep all three distinct and implemented separately. This has given a clear performance benefit over implementing one as the base and the others on top.

And don't forget data parallelism, but std.parallelism already provides good stuff in that department for D – though it could do with some new love. I guess D could be said to have actors already using spawn and the message queue.

Dataflow is though where "Big Data" is going. There are commercial offerings in the JVM space and they are making huge profits on licensing, simply because the frameworks work.

--
Russel.
Mar 08 2014
prev sibling next sibling parent "logicchains" <jonathan.t.barnard gmail.com> writes:
On Saturday, 8 March 2014 at 11:23:17 UTC, Russel Winder wrote:
 I guess D could be said to have actors already using spawn and 
 the
 message queue.

In std.concurrency, the documentation states that: "Right now, only in-process threads are supported and referenced by a more specialized handle called a Tid. It is effectively a subclass of Cid, with additional features specific to in-process messaging". Is there any timeline on when out-of-process threads will be supported? I think that would bring D closer to being able to achieve Erlang-style concurrency.
Mar 08 2014
prev sibling next sibling parent "Atila Neves" <atila.neves gmail.com> writes:
Sure, I'd love to see CSP in D as well. I think that Go's
advantage is simplicity. If you want to try the same code on more
system threads, all you need to do is increase GOMAXPROCS. With
vibe.d it requires some work. It's not a lot of work but it isn't
as easy as with Go.

OTOH, D + vibe.d give you more control. If I want to have a
dedicated thread to do some tasks and tweak the system, I can. I
tried a bunch of different approaches to using threads to try and
make it go faster that (AFAIK) I wouldn't have been able to try in
Go. None of them ended up speeding anything up, but I learned a
lot and it was fun trying.

Atila

On Friday, 7 March 2014 at 18:01:53 UTC, Russel Winder wrote:
 On Fri, 2014-03-07 at 12:23 +0000, Atila Neves wrote:
 […]
 I suspect you might have missed the point of my original blog 
 post. Yes, it shows D beating Erlang and Go, and that's 
 something I obviously like. But that wasn't the point I was 
 trying to make. My point was that just by writing it in Go 
 doesn't mean magical performance benefits because of its CSP, 
 and that vibe.d's fibers would do just fine in a direct 
 competition. The data seem to support that.

That doesn't mean CSP and dataflow implementations for D (à la DataRush, GPars, Go, PythonCSP, PyCSP) shouldn't be attempted. Sadly I think I do not have the time to drive such an endeavour, but I wish I could contribute to it if someone else could drive.

Mar 08 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Saturday, 8 March 2014 at 12:13:07 UTC, logicchains wrote:
 On Saturday, 8 March 2014 at 11:23:17 UTC, Russel Winder wrote:
 I guess D could be said to have actors already using spawn and 
 the
 message queue.

In std.concurrency, the documentation states that: "Right now, only in-process threads are supported and referenced by a more specialized handle called a Tid. It is effectively a subclass of Cid, with additional features specific to in-process messaging". Is there any timeline on when out-of-process threads will be supported? I think that would bring D closer to being able to achieve Erlang-style concurrency.

There's already a pull request in place to support green threads. If you mean IPC, we really need serialization first, and it would be nice to have a decent network API as well. But I've been meaning to sort out a prototype anyway. Tid will remain the reference to a thread regardless of which process it lives in, and I'll be adding a Node type.
Mar 08 2014
prev sibling next sibling parent Russel Winder <russel winder.org.uk> writes:

On Sat, 2014-03-08 at 08:53 -0800, Andrei Alexandrescu wrote:
 On 3/8/14, 3:22 AM, Russel Winder wrote:
 Dataflow is though where "Big Data" is going. There are commercial
 offerings in the JVM space and they are making huge profits on
 licencing, simply because the frameworks work.

Do you have a couple of relevant links describing dataflow?

First and foremost we have to distinguish dataflow software architectures from dataflow computers. The latter were an alternate hardware architecture that failed to gain traction, but there is an awful lot of literature out there on it. So just searching the Web is likely to give a lot of that, especially in the period 1980 to 1995.

The dataflow software architectures are modelled directly on the structural concepts of dataflow hardware and so the terminology is exactly the same. However, whereas an operator in hardware means add, multiply, etc., in a software architecture it just means some sequential computation that requires certain inputs and delivers some outputs. The computation must be a process, so effectively a function with no free variables.

The GPars version of this is at:
http://gpars.codehaus.org/Dataflow
http://gpars.org/guide/guide/dataflow.html
GPars needs some more work, but I haven't had a chance to focus on it recently.

This introduces all the cute jargon:
http://www.cs.colostate.edu/cameron/dataflow.html

Wikipedia has this page:
http://en.wikipedia.org/wiki/Dataflow_programming
but it is clearly in need of some sub-editing.

Hopefully this does as a start. I can try to hunt up some other things if that would help.

The commercial offering I know something of is called DataRush, a product from a subgroup in the Pervasive group for the JVM (and optionally Hadoop):
http://en.wikipedia.org/wiki/DataRush_Technology
I played with this in 2008 before it was formally released, and on and off since. GPars dataflow should compete with this, but they are a company with resources, and GPars has two fairly non-active (due to work commitments) volunteer developers. We had been hoping that, since GPars is core Groovy technology required for Grails and all the other Gr8 technologies, people would step up. However the very concept of a concurrency and parallelism framework seems to frighten off even some of the best programmers.

--
Russel.
Mar 08 2014
prev sibling next sibling parent "Bienlein" <jeti789 web.de> writes:
On Wednesday, 12 March 2014 at 09:26:28 UTC, Sönke Ludwig wrote:

 I actually don't see a reason why it can't be just as efficient 
 when done as a library. Taking the example of vibe.d, fibers 
 are currently never moved between threads (although 
 technically, they could), but they are still stored in a free 
 list and reused for later tasks.

I believe several kernel threads are in play to call fibers. Then the free list must be synchronized, which can make a difference on a heavily loaded system at the end of the day. HawtDispatch (http://hawtdispatch.fusesource.org) applies some tricks to reduce synchronization on its free lists for that reason. But I honestly don't have a clue how that exactly works.
Mar 12 2014
prev sibling next sibling parent "Etienne" <etcimon gmail.com> writes:
On Wednesday, 12 March 2014 at 12:10:04 UTC, Bienlein wrote:
 On Wednesday, 12 March 2014 at 09:26:28 UTC, Sönke Ludwig wrote:

 I actually don't see a reason why it can't be just as 
 efficient when done as a library. Taking the example of 
 vibe.d, fibers are currently never moved between threads 
 (although technically, they could), but they are still stored 
 in a free list and reused for later tasks.

I believe several kernel threads are in the play to call fibers. Then the free list must be synchronized which can make a difference on a heavy loaded system at the end of the day. HawtDispatch (http://hawtdispatch.fusesource.org) applies some tricks to reduce synchronization on its free lists for that reason. But I honestly don't have a clue how that exactly works.

Bypassing the kernel could be more efficient for fibers if it were possible, and using thread affinity it could remove some interruption by setting the maxcpus option in the kernel. The alternative to locking via the kernel is queuing using the freeway overpass method described here: http://blog.erratasec.com/2013/02/multi-core-scaling-its-not-multi.html

I think HawtDispatch may be using queues to fit into this synchronization method. Snort is also a good example of mostly lock-less multi-core by using "memory mapped regions".

I'm also very interested in optimizing fibers further, as it would give D an edge where it already does great.
Mar 12 2014
prev sibling next sibling parent "Etienne" <etcimon gmail.com> writes:
On Wednesday, 12 March 2014 at 15:11:45 UTC, Etienne wrote:
 On Wednesday, 12 March 2014 at 12:10:04 UTC, Bienlein wrote:
 On Wednesday, 12 March 2014 at 09:26:28 UTC, Sönke Ludwig 
 wrote:

 I actually don't see a reason why it can't be just as 
 efficient when done as a library. Taking the example of 
 vibe.d, fibers are currently never moved between threads 
 (although technically, they could), but they are still stored 
 in a free list and reused for later tasks.

I believe several kernel threads are in the play to call fibers. Then the free list must be synchronized which can make a difference on a heavy loaded system at the end of the day. HawtDispatch (http://hawtdispatch.fusesource.org) applies some tricks to reduce synchronization on its free lists for that reason. But I honestly don't have a clue how that exactly works.

Bypassing the kernel could be more efficient for fibers if it were possible, and using thread affinity it could remove some interruption by setting the maxcpus option in the kernel. The alternative to locking via kernel is queuing using the freeway overpass method described here: http://blog.erratasec.com/2013/02/multi-core-scaling-its-not-multi.html I think HawtDispatch may be using queues to fit into this synchronization method. Snort is also a good example of mostly lock-less multi-core by using "memory mapped regions" I'm also very interested in optimizing fibers further as it would give D excellence where it already does great

I think this article puts it well. Bypassing the kernel for fibers should be a long-term plan :) http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html
Mar 12 2014
prev sibling next sibling parent Iain Buclaw <ibuclaw gdcproject.org> writes:
On 12 March 2014 18:05, Etienne <etcimon gmail.com> wrote:
 On Wednesday, 12 March 2014 at 15:11:45 UTC, Etienne wrote:
 On Wednesday, 12 March 2014 at 12:10:04 UTC, Bienlein wrote:
 On Wednesday, 12 March 2014 at 09:26:28 UTC, Sönke Ludwig wrote:

 I actually don't see a reason why it can't be just as efficient when
 done as a library. Taking the example of vibe.d, fibers are currently never
 moved between threads (although technically, they could), but they are still
 stored in a free list and reused for later tasks.

I believe several kernel threads are in the play to call fibers. Then the free list must be synchronized which can make a difference on a heavy loaded system at the end of the day. HawtDispatch (http://hawtdispatch.fusesource.org) applies some tricks to reduce synchronization on its free lists for that reason. But I honestly don't have a clue how that exactly works.

Bypassing the kernel could be more efficient for fibers if it were possible, and using thread affinity it could remove some interruption by setting the maxcpus option in the kernel. The alternative to locking via kernel is queuing using the freeway overpass method described here: http://blog.erratasec.com/2013/02/multi-core-scaling-its-not-multi.html I think HawtDispatch may be using queues to fit into this synchronization method. Snort is also a good example of mostly lock-less multi-core by using "memory mapped regions" I'm also very interested in optimizing fibers further as it would give D excellence where it already does great

I think this article puts it well. Bypassing the kernel for fibers should be a long-term plan :)

Not just fibers, but the entire synchronisation stack - which is currently just a wrapper around pthreads/winthreads.
Mar 12 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 12 March 2014 at 18:05:38 UTC, Etienne wrote:
 I think this article puts it well. Bypassing the kernel for 
 fibers should be a long-term plan :)

 http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html

I have seen one real-world project where it was done. The point is not specifically about fibers though, but scheduling as a whole - when all resources of the system are supposed to be devoted to a single service, general-purpose OS scheduling creates problems, as it is intended for universal multi-tasking.
Mar 12 2014
prev sibling next sibling parent "Etienne" <etcimon gmail.com> writes:
On Thursday, 13 March 2014 at 06:49:35 UTC, Dicebot wrote:
 On Wednesday, 12 March 2014 at 18:05:38 UTC, Etienne wrote:
 I think this article puts it well. Bypassing the kernel for 
 fibers should be a long-term plan :)

 http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html

I have seen one real-world project where it was done. Point is not about specifically fibers though but scheduling as a whole - when all resources of the system are supposed to be devoted to a single service, general-purpose OS scheduling creates problems as it is intended for universal multi-tasking.

I know it would be breaking for other services on the computer, assuming it's a desktop, but dedicated servers or embedded devices can make great use of such a feature. I'm sure this implementation could be done without restricting everything to it, especially with functional programming as we have it in D. I assume a demonstrated ten-fold increase in performance from bypassing the kernel is a radical justification for this.
Mar 13 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 13 March 2014 at 18:49:42 UTC, Etienne wrote:
 On Thursday, 13 March 2014 at 06:49:35 UTC, Dicebot wrote:
 On Wednesday, 12 March 2014 at 18:05:38 UTC, Etienne wrote:
 I think this article puts it well. Bypassing the kernel for 
 fibers should be a long-term plan :)

 http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html

I have seen one real-world project where it was done. Point is not about specifically fibers though but scheduling as a whole - when all resources of the system are supposed to be devoted to a single service, general-purpose OS scheduling creates problems as it is intended for universal multi-tasking.

I know it would be breaking for other services on the computer assuming it's a desktop, but dedicated servers or embedded devices can make great use of such a feature. I'm sure this implementation could be done without restricting everything to it, especially with functional programming as we have it in D. I assume a demonstrated ten-fold increase in performance by-passing kernel is a radical justification for this.

In the project I mentioned it was taken to an extreme measure, eliminating the kernel completely and sticking to a barebone executable for all traffic processing (with customized Linux nodes for management tasks). The performance achieved was very impressive (on the scale of hundreds of Gbps of throughput and millions of simultaneous TCP/UDP flows). Using D for such a task has similar issues and solutions as using D for embedded (Adam, I am looking at your DConf talk!)
Mar 14 2014
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Friday, 14 March 2014 at 17:44:36 UTC, Dicebot wrote:
 Using D for such task has similar issues and solutions as using 
 D for embedded (Adam, I am looking at your DConf talk!)

Hmm, I doubt I'll say anything you don't already know though... Right now, I'm thinking the topics will be along the lines of:

*) Smashing druntime then building it back up to see what does what, so we'll look at TypeInfo, _d????() functions, exception handling, class implementation, etc.

*) Probably my pet topic of RTInfo just because we can

*) naked asm for interrupt handlers (I just think it is cool that you can do it all right in D without hacking dmd itself). I wrote a keyboard handler a couple days ago, nothing fancy but it shows a nice interactive result.

*) a few ABI things and notes about how some constructs work, like scope(exit) on the assembly language level

*) Memory-mapped hardware and struct packing (surely nothing new to anyone who's done low level code before.)

*) And I actually want to bring the garbage collector in too (*gasp!*). It might be bare metal, but it is still overpowered PC hardware, we might as well play with the resources.

But I wasn't planning on even trying to do anything like a network stack, or even getting into particularly fancy D code. tbh my audience is more the reddit crowd that says "D sucks. not real systems level language." just to say "no u wrong" while saying some things D enthusiasts might find interesting than to really expand the minds of embedded D developers or anything like that; hell, odds are you know (a lot) more than me on that anyway.
Mar 14 2014
prev sibling next sibling parent reply "Bienlein" <jeti789 web.de> writes:
On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:

There is a thread now on the Go user forum about GoF design 
patterns in Go: 
https://groups.google.com/forum/?hl=de#!topic/golang-nuts/3fOIZ1VLn1o 
Reading the comments by Robert Pike (the Go lead developer) is 
insightful. Here is one of them:

"A concrete example: The Visitor Pattern.

This is a clever, subtle pattern that uses subtype inheritance to
implement a type switch.

Go has type switches, and therefore no need for the Visitor 
Pattern."

With type switches he means a case switch on types, see 
http://golang.org/doc/effective_go.html#type_switch

In other words, Go and OOP: abandon all hope! From my side the 
"Go vs D MQTT thing" is closed. Go will never develop into 
anything other than C in a modern disguise.

Maybe I hijacked the thread another time. Sorry, but I 
couldn't resist. At least I did resist posting a reply in that 
thread on the Go user forum. I think it would be plain useless ...
Mar 17 2014
next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
On 17.03.2014 17:16, Bienlein wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:

 There is a thread now on the Go user forum about GoF design patterns in
 Go:
 https://groups.google.com/forum/?hl=de#!topic/golang-nuts/3fOIZ1VLn1o
 Reading the comments by Robert Pike (the Go lead developer) is
 insightful. Here is one of them:

 "A concrete example: The Visitor Pattern.

 This is a clever, subtle pattern that uses subtype inheritance to
 implement a type switch.

 Go has type switches, and therefore no need for the Visitor Pattern."

 With type switches he means a case switch on types, see
 http://golang.org/doc/effective_go.html#type_switch

 In other words, Go and OOP: Abandon all Hope! From my side the "Go vs D
 MQTT thing" is closed. Go will never develop into any thing than C in a
 modern disguise.

 Maybe I now hi-jacked the thread another time. Sorry, but couldn't
 resist. At least I did resist to post a reply in that thread on the Go
 user forum. I think it would be plain useless ...

That is no wonder. If you search the web for references, you will find that Rob Pike very much dislikes OOP.

When I jumped into Go at the language's announcement, it was because of Oberon's influence on the language. With time, however, I came to realize my time is better spent with other language communities that enjoy modern features, instead of on a remake of Limbo.

I still check gonuts every now and then, though. I just don't bother posting anything.

--
Paulo
Mar 17 2014
parent Paulo Pinto <pjmlp progtools.org> writes:
On 17.03.2014 22:24, Bienlein wrote:
 On Monday, 17 March 2014 at 17:02:06 UTC, Paulo Pinto wrote:
 That is no wonder.

 If you search the web for references, you will find that Rob Pike very
 much dislikes OOP.

All right, but what is then the solution to encapsulate things? A type switch breaks encapsulation: if you change some inner workings of component A, you might have to extend the type switch in component B. I understand the argument that dynamic binding is a high price to pay for this, but a type switch as in Go, which simply breaks encapsulation, is not very convincing.
 When I jumped into Go as of the language's announcement, was due to
 the language influence of Oberon.

Do you have some affiliation with ETHZ? Oberon didn't spread much outside of it. I played with Oberon many years ago, and I also recognized similarities to it in Go. I just read about it again to recap, and it was striking to see how much the Oberon WITH statement resembles a Go type switch. I guess Niklaus Wirth would like Go ...

A spiritual affiliation, if you will.

I learned Pascal via Turbo Pascal before I got to learn C, and it spoiled me into never enjoying pure C, although I like C++. A few years later, when I discovered I could not use Turbo Pascal on UNIX, I realized how basic plain ISO Pascal was. The improved ISO Extended Pascal was being ignored, as Pascal compiler vendors tried to be compatible with Turbo Pascal.

This was around the early 90's, when I started to interest myself in language design, which meant trying to learn as much as possible from all sources of information. Mostly books and OOPSLA papers; not much Internet in those days. The university library had lots of cool books, including many about Modula-2 and Oberon. So, given my appreciation for Wirth's work, I devoured those books and discovered, in addition to Turbo Pascal, a few more languages that could be used for systems programming.

Around this time ETHZ started to support standard PCs in addition to the Ceres hardware, so I got to install Oberon on my computer. I was playing with the idea of creating a compiler for Oberon on GNU/Linux, which never came to be for a few reasons, although I did write an initial lexer and grammar for it.

https://github.com/pjmlp/Oberon-2-FrontendTools

You might find a few posts from me in the comp.compilers archives from those days.

The system impressed me for trying to provide an experience similar to Smalltalk, which I already knew, and for showing me that a full-blown OS written in a GC-enabled systems programming language was possible.

Since then, I have tracked Wirth's work, collecting all his publications and books. I also had the pleasure of being with him when CERN organized an Oberon day back in 2004, when I was still there.

--
Paulo
Mar 17 2014
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/17/14, 9:16 AM, Bienlein wrote:
 On Thursday, 6 March 2014 at 17:17:12 UTC, Atila Neves wrote:

 There is a thread now on the Go user forum about GoF design patterns in
 Go:
 https://groups.google.com/forum/?hl=de#!topic/golang-nuts/3fOIZ1VLn1o
 Reading the comments by Robert Pike (the Go lead developer) is
 insightful. Here is one of them:

 "A concrete example: The Visitor Pattern.

 This is a clever, subtle pattern that uses subtype inheritance to
 implement a type switch.

 Go has type switches, and therefore no need for the Visitor Pattern."

 With type switches he means a case switch on types, see
 http://golang.org/doc/effective_go.html#type_switch

 In other words, Go and OOP: Abandon all Hope! From my side the "Go vs D
 MQTT thing" is closed. Go will never develop into any thing than C in a
 modern disguise.

 Maybe I now hi-jacked the thread another time. Sorry, but couldn't
 resist. At least I did resist to post a reply in that thread on the Go
 user forum. I think it would be plain useless ...

That's fine - the man doesn't like OOP, and that influences the design of his language. I also suspect he's not conversant with the various modularity-related aspects of Visitor, given the glibness of the answer. And that's all fine. Walter and I also have our various lacunae, and those do influence the design of D. The same goes for virtually all programming languages.

Andrei
Mar 17 2014
prev sibling next sibling parent "Bienlein" <jeti789 web.de> writes:
On Monday, 17 March 2014 at 17:02:06 UTC, Paulo Pinto wrote:
 That is no wonder.

 If you search the web for references, you will find that Rob 
 Pike very much dislikes OOP.

All right, but what is then the solution to encapsulate things? A type switch breaks encapsulation: if you change some inner workings of component A, you might have to extend the type switch in component B. I understand the argument that dynamic binding is a high price to pay for this, but a type switch as in Go, which simply breaks encapsulation, is not very convincing.
 When I jumped into Go as of the language's announcement, was 
 due to the language influence of Oberon.

Do you have some affiliation with ETHZ? Oberon didn't spread much outside of it. I played with Oberon many years ago, and I also recognized similarities to it in Go. I just read about it again to recap, and it was striking to see how much the Oberon WITH statement resembles a Go type switch. I guess Niklaus Wirth would like Go ...
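The encapsulation objection can be made concrete with a hypothetical sketch (`Circle`, `Square`, and `area` are invented for illustration): component-B-style code switches on component A's concrete types, so any new type added to A requires editing every such switch in B, and the compiler will not complain if one is forgotten:

```go
package main

import "fmt"

// Types owned by "component A".
type Circle struct{ R float64 }
type Square struct{ Side float64 }

// area lives in "component B" and switches on A's concrete types.
// If A later adds a Triangle, this switch silently falls through
// to the default case; nothing forces B to be updated.
func area(shape interface{}) float64 {
	switch s := shape.(type) {
	case Circle:
		return 3.14159 * s.R * s.R
	case Square:
		return s.Side * s.Side
	default:
		return 0 // a newly added type from A ends up here, unnoticed
	}
}

func main() {
	fmt.Println(area(Square{Side: 2})) // 4
}
```

With dynamic binding, by contrast, a new type brings its own method along, and client code needs no change.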
Mar 17 2014
prev sibling next sibling parent "Bienlein" <jeti789 web.de> writes:
On Monday, 17 March 2014 at 20:39:21 UTC, Andrei Alexandrescu 
wrote:

 That's fine - the man doesn't like OOP and that influences the 
 design of his language. I also suspect he's not conversant with 
 the various modularity-related aspects of Visitor, given the 
 glibness of the answer.

Yeah, it's usually the story about Dr. Johnson's dog ... ;-).
Mar 17 2014
prev sibling next sibling parent "Sean Kelly" <sean invisibleduck.org> writes:
On Monday, 17 March 2014 at 21:24:53 UTC, Bienlein wrote:
 On Monday, 17 March 2014 at 17:02:06 UTC, Paulo Pinto wrote:
 That is no wonder.

 If you search the web for references, you will find that Rob 
 Pike very much dislikes OOP.

All right, but what is then the solution to encapsulate things?

To do it all manually with function variables, of course, just like in C.
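A minimal sketch of what that might look like, here in Go rather than C (the `Counter` type and its fields are hypothetical, purely to illustrate function-valued fields closing over hidden state):

```go
package main

import "fmt"

// Counter exposes behavior only through function-valued fields,
// a hand-rolled vtable; the count itself is never exported.
type Counter struct {
	Inc   func()
	Value func() int
}

// NewCounter wires up the fields as closures: the state n lives in
// the closure environment, hidden from every caller.
func NewCounter() Counter {
	n := 0
	return Counter{
		Inc:   func() { n++ },
		Value: func() int { return n },
	}
}

func main() {
	c := NewCounter()
	c.Inc()
	c.Inc()
	fmt.Println(c.Value()) // 2
}
```

Encapsulation is achieved, but every "object" carries its own function pointers and all wiring is done by hand, exactly the trade-off being pointed out.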
Mar 17 2014
prev sibling parent Marco Leise <Marco.Leise gmx.de> writes:
On Wed, 12 Mar 2014 10:41:11 +0100,
Sönke Ludwig <sludwig+dforum outerproduct.org> wrote:

 But since the memory pages of a fiber's stack are allocated lazily, at
 least on a 64-bit OS, where address space is not an issue, you can
 actually scale to very high numbers with a decent amount of RAM.

This means that for each fiber you allocate e.g. 1 MiB of virtual memory as a stack and let page faults back it with RAM on demand, right?

--
Marco
Mar 17 2014