www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - How to call stop from parallel foreach

reply seany <seany uni-bonn.de> writes:
I have seen 
[this](https://forum.dlang.org/thread/akhbvvjgeaspmjntznyk forum.dlang.org).

I can't call break form parallel foreach.

Okey, Is there a way to easily call .stop() from such a  case?

Here is a case to consider:

     outer: foreach(i, a; parallel(array_of_a)) {
        foreach(j, b; parallel(array_of_b)) {
          auto c = myFunction0(i,j);
          auto d = myFunction1(a,b);
          auto f = myFunction2(i,b);
          auto g = myFunction3(a,j);

          if(someConditionCheck(c,d,f,g)) {
            // stop the outer foreach loop here
          }

        }
     }

Thank you
Jun 24
next sibling parent reply Jerry <labuurii gmail.com> writes:
On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
 I have seen 
 [this](https://forum.dlang.org/thread/akhbvvjgeaspmjntznyk forum.dlang.org).

 I can't call break form parallel foreach.

 Okey, Is there a way to easily call .stop() from such a  case?

 Here is a case to consider:

     outer: foreach(i, a; parallel(array_of_a)) {
        foreach(j, b; parallel(array_of_b)) {
          auto c = myFunction0(i,j);
          auto d = myFunction1(a,b);
          auto f = myFunction2(i,b);
          auto g = myFunction3(a,j);

          if(someConditionCheck(c,d,f,g)) {
            // stop the outer foreach loop here
          }

        }
     }

 Thank you
Maybe I'm wrong here, but I don't think there is any way to do that with parallel. What I would do is negate someConditionCheck and instead only do work when there is work to be done. Obviously that may or may not be suitable. But with parallel I don't see any way to make it happen.
Jun 24
parent reply seany <seany uni-bonn.de> writes:
On Thursday, 24 June 2021 at 19:46:52 UTC, Jerry wrote:
 On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
 [...]
Maybe I'm wrong here, but I don't think there is any way to do that with parallel. What I would do is negate someConditionCheck and instead only do work when there is work to be done. Obviously that may or may not be suitable. But with parallel I don't see any way to make it happen.
The parallel() function is running a taskpool. I should be able to stop that in any case...
Jun 24
parent seany <seany uni-bonn.de> writes:
On Thursday, 24 June 2021 at 20:08:06 UTC, seany wrote:
 On Thursday, 24 June 2021 at 19:46:52 UTC, Jerry wrote:
 On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
 [...]
Maybe I'm wrong here, but I don't think there is any way to do that with parallel. What I would do is negate someConditionCheck and instead only do work when there is work to be done. Obviously that may or may not be suitable. But with parallel I don't see any way to make it happen.
The parallel() function is running a taskpool. I should be able to stop that in any case...
PS : I have done this : parallelContainer: while(true) { outer: foreach(i, a; parallel(array_of_a)) { foreach(j, b; parallel(array_of_b)) { auto c = myFunction0(i,j); auto d = myFunction1(a,b); auto f = myFunction2(i,b); auto g = myFunction3(a,j); if(someConditionCheck(c,d,f,g)) { break parallelContainer; } } break; } Is this safe? Will this cause any problem that I can't foresee now? Thank you
Jun 24
prev sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
 I have seen 
 [this](https://forum.dlang.org/thread/akhbvvjgeaspmjntznyk forum.dlang.org).

 I can't call break form parallel foreach.

 Okey, Is there a way to easily call .stop() from such a  case?
Yes there is, but it won’t break the `foreach`: ```d auto tp = taskPool; foreach (i, ref e; tp.parallel(a)) { // … tp.stop; } ``` The reason this does not work is because `stop` terminates the worker threads as soon as they are finished with their current `Task`, but no sooner. `parallel` creates the `Task`s before it presents a range to `foreach`, so no new `Task`s are created during iteration. Therefore all elements are iterated.
     outer: foreach(i, a; parallel(array_of_a)) {
        foreach(j, b; parallel(array_of_b)) {
By the way, nesting parallel `foreach` does not make much sense, as one level already distributes the load across all cores (but one). Additional parallelisation will likely just add overhead, and have a net negative effect. — Bastiaan.
Jun 24
next sibling parent reply seany <seany uni-bonn.de> writes:
On Thursday, 24 June 2021 at 20:33:00 UTC, Bastiaan Veelo wrote:

 By the way, nesting parallel `foreach` does not make much 
 sense, as one level already distributes the load across all 
 cores (but one). Additional parallelisation will likely just 
 add overhead, and have a net negative effect.

 — Bastiaan.
Okey. So consider : foreach(array_elem; parallel(an_array)) { dothing(array_elem); } and then in `dothing()` : foreach(subelem; array_elem) { dootherthing(subelem); } - Will this ALSO cause the same overhead? Is there any way to control the number of CPU cores used in parallelization ? E.g : take 3 cores for the first parallel foreach - and then for the second one, take 3 cores each -> so 3 + 3 * 3 = 12 cores out of a 16 core system? Thank you.
Jun 24
next sibling parent reply Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Thursday, 24 June 2021 at 20:41:40 UTC, seany wrote:
 On Thursday, 24 June 2021 at 20:33:00 UTC, Bastiaan Veelo wrote:

 By the way, nesting parallel `foreach` does not make much 
 sense, as one level already distributes the load across all 
 cores (but one). Additional parallelisation will likely just 
 add overhead, and have a net negative effect.

 — Bastiaan.
Okey. So consider : foreach(array_elem; parallel(an_array)) { dothing(array_elem); } and then in `dothing()` : foreach(subelem; array_elem) { dootherthing(subelem); } - Will this ALSO cause the same overhead?
You can nest multiple `foreach`, but only parallelise one like so: ```d foreach(array_elem; parallel(an_array)) foreach(subelem; array_elem) dootherthing(subelem); ``` So there is no need to hide one of them in a function.
 Is there any way to control the number of CPU cores used in 
 parallelization ?

 E.g : take 3 cores for the first parallel foreach - and then 
 for the second one, take 3 cores each -> so 3 + 3 * 3 = 12 
 cores out of a 16 core system? Thank you.
There might be, by using various `TaskPool`s with a smaller number of work threads: https://dlang.org/phobos/std_parallelism.html#.TaskPool.this.2. But I cannot see the benefit of doing this. It will just distribute the same amount of work in a different way. — Bastiaan.
Jun 24
parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Thursday, 24 June 2021 at 21:05:28 UTC, Bastiaan Veelo wrote:
 On Thursday, 24 June 2021 at 20:41:40 UTC, seany wrote:
 Is there any way to control the number of CPU cores used in 
 parallelization ?

 E.g : take 3 cores for the first parallel foreach - and then 
 for the second one, take 3 cores each -> so 3 + 3 * 3 = 12 
 cores out of a 16 core system? Thank you.
There might be, by using various `TaskPool`s with a smaller number of work threads: https://dlang.org/phobos/std_parallelism.html#.TaskPool.this.2. But I cannot see the benefit of doing this. It will just distribute the same amount of work in a different way.
Actually, I think this would be suboptimal as well, as the three outer threads seem to do no real work. — Bastiaan.
Jun 24
prev sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 6/24/21 1:41 PM, seany wrote:

 Is there any way to control the number of CPU cores used in
 parallelization ?
Yes. You have to create a task pool explicitly: import std.parallelism; void main() { enum threadCount = 2; auto myTaskPool = new TaskPool(threadCount); scope (exit) { myTaskPool.finish(); } enum workUnitSize = 1; // Or 42 or something else. :) foreach (e; myTaskPool.parallel([ 1, 2, 3 ], workUnitSize)) { // ... } } I've touched on a few parallelism concepts at this point in a presentation: https://www.youtube.com/watch?v=dRORNQIB2wA&t=1332s Ali
Jun 24
parent reply seany <seany uni-bonn.de> writes:
On Thursday, 24 June 2021 at 21:19:19 UTC, Ali Çehreli wrote:
 On 6/24/21 1:41 PM, seany wrote:

 Is there any way to control the number of CPU cores used in
 parallelization ?
Yes. You have to create a task pool explicitly: import std.parallelism; void main() { enum threadCount = 2; auto myTaskPool = new TaskPool(threadCount); scope (exit) { myTaskPool.finish(); } enum workUnitSize = 1; // Or 42 or something else. :) foreach (e; myTaskPool.parallel([ 1, 2, 3 ], workUnitSize)) { // ... } } I've touched on a few parallelism concepts at this point in a presentation: https://www.youtube.com/watch?v=dRORNQIB2wA&t=1332s Ali
I tried this . int[][] pnts ; pnts.length = fld.length; enum threadCount = 2; auto prTaskPool = new TaskPool(threadCount); scope (exit) { prTaskPool.finish(); } enum workUnitSize = 1; foreach(i, fLine; prTaskPool.parallel(fld, workUnitSize)) { //.... } This is throwing random segfaults. CPU has 2 cores, but usage is not going above 37% Even much deeper down in program, much further down the line... And the location of segfault is random.
Jun 25
next sibling parent seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
 On Thursday, 24 June 2021 at 21:19:19 UTC, Ali Çehreli wrote:
 [...]
I tried this . int[][] pnts ; pnts.length = fld.length; enum threadCount = 2; auto prTaskPool = new TaskPool(threadCount); scope (exit) { prTaskPool.finish(); } enum workUnitSize = 1; foreach(i, fLine; prTaskPool.parallel(fld, workUnitSize)) { //.... } This is throwing random segfaults. CPU has 2 cores, but usage is not going above 37% Even much deeper down in program, much further down the line... And the location of segfault is random.
PS. line [this](https://forum.dlang.org/thread/gomhpxzolddnodaeyvlh forum.dlang.org) I am running into bus errors Too , sometimes way down the line after these foreach calls are completed...
Jun 25
prev sibling next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 6/25/21 6:53 AM, seany wrote:

 I tried this .

                  int[][] pnts ;
          pnts.length = fld.length;

          enum threadCount = 2;
          auto prTaskPool = new TaskPool(threadCount);

          scope (exit) {
              prTaskPool.finish();
          }

          enum workUnitSize = 1;

          foreach(i, fLine; prTaskPool.parallel(fld, workUnitSize)) {
                    //....
                  }


 This is throwing random segfaults.
 CPU has 2 cores, but usage is not going above 37%
Performance is not guaranteed depending on many factors. For example, inserting a writeln() call in the loop would make all threads compete with each other for stdout. There can be many contention points some of which depending on your program logic. (And "Amdahl's Law" applies.) Another reason: 1 can be a horrible value for workUnitSize. Try 100, 1000, etc. and see whether it helps with performance.
 Even much deeper down in program, much further down the line...
 And the location of segfault is random.
Do you still have two parallel loops? Are both with explicit TaskPool objects? If not, I wonder whether multiple threads are using the convenient 'parallel' function, stepping over each others' toes. (I am not sure about this because perhaps it's safe to do this; never tested.) It is possible that the segfaults are caused by your code. The code you showed in your original post (myFunction0() and others), they all work on independent data structures, right? Ali
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 14:10:52 UTC, Ali Çehreli wrote:
 On 6/25/21 6:53 AM, seany wrote:

          [...]
workUnitSize)) {
 [...]
Performance is not guaranteed depending on many factors. For example, inserting a writeln() call in the loop would make all threads compete with each other for stdout. There can be many contention points some of which depending on your program logic. (And "Amdahl's Law" applies.) Another reason: 1 can be a horrible value for workUnitSize. Try 100, 1000, etc. and see whether it helps with performance.
 [...]
line...
 [...]
Do you still have two parallel loops? Are both with explicit TaskPool objects? If not, I wonder whether multiple threads are using the convenient 'parallel' function, stepping over each others' toes. (I am not sure about this because perhaps it's safe to do this; never tested.) It is possible that the segfaults are caused by your code. The code you showed in your original post (myFunction0() and others), they all work on independent data structures, right? Ali
The code without the parallel foreach works fine. No segfault. In several instances, I do have multiple nested loops, but in every case. only the outer one in parallel foreach. All of them are with explicit taskpool definition.
Jun 25
parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 6/25/21 7:21 AM, seany wrote:

 The code without the parallel foreach works fine. No segfault.
That's very common. What I meant is, is the code written in a way to work safely in a parallel foreach loop? (i.e. Is the code "independent"?) (But I assume it is because it's been the common theme in this thread; so there must be something stranger going on.) Ali
Jun 25
parent seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 15:08:38 UTC, Ali Çehreli wrote:
 On 6/25/21 7:21 AM, seany wrote:

 The code without the parallel foreach works fine. No segfault.
That's very common. What I meant is, is the code written in a way to work safely in a parallel foreach loop? (i.e. Is the code "independent"?) (But I assume it is because it's been the common theme in this thread; so there must be something stranger going on.) Ali
I have added MWP. did you have a chance to look at it?
Jun 25
prev sibling parent reply jfondren <julian.fondren gmail.com> writes:
On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
 I tried this .

                 int[][] pnts ;
 		pnts.length = fld.length;

 		enum threadCount = 2;
 		auto prTaskPool = new TaskPool(threadCount);

 		scope (exit) {
 			prTaskPool.finish();
 		}

 		enum workUnitSize = 1;

 		foreach(i, fLine; prTaskPool.parallel(fld, workUnitSize)) {
                   //....
                 }
A self-contained and complete example would help a lot, but the likely problem with this code is that you're accessing pnts[y][x] in the loop, which makes the loop bodies no longer independent because some of them need to first allocate an int[] to replace the zero-length pnts[y] that you're starting with. Consider: ``` $ rdmd --eval 'int[][] p; p.length = 5; p.map!"a.length".writeln' [0, 0, 0, 0, 0] ```
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 14:13:14 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
[...]
A self-contained and complete example would help a lot, but the likely problem with this code is that you're accessing pnts[y][x] in the loop, which makes the loop bodies no longer independent because some of them need to first allocate an int[] to replace the zero-length pnts[y] that you're starting with. Consider: ``` $ rdmd --eval 'int[][] p; p.length = 5; p.map!"a.length".writeln' [0, 0, 0, 0, 0] ```
This particular location does not cause segfault. It is segfaulting down the line in a completely unrelated location... Wait I will try to make a MWP.
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 14:22:25 UTC, seany wrote:
 On Friday, 25 June 2021 at 14:13:14 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
[...]
A self-contained and complete example would help a lot, but the likely problem with this code is that you're accessing pnts[y][x] in the loop, which makes the loop bodies no longer independent because some of them need to first allocate an int[] to replace the zero-length pnts[y] that you're starting with. Consider: ``` $ rdmd --eval 'int[][] p; p.length = 5; p.map!"a.length".writeln' [0, 0, 0, 0, 0] ```
This particular location does not cause segfault. It is segfaulting down the line in a completely unrelated location... Wait I will try to make a MWP.
[Here is MWP](https://github.com/naturalmechanics/mwp). Please compile with `dub build -b release --compiler=ldc2 `. Then to run, please use : `./tracker_ai --filename 21010014-86.ptl `
Jun 25
parent reply jfondren <julian.fondren gmail.com> writes:
On Friday, 25 June 2021 at 14:44:13 UTC, seany wrote:
 This particular location does not cause segfault.
 It is segfaulting down the line in a completely unrelated 
 location... Wait I will try to make a MWP.
[Here is MWP](https://github.com/naturalmechanics/mwp). Please compile with `dub build -b release --compiler=ldc2 `. Then to run, please use : `./tracker_ai --filename 21010014-86.ptl `
With ldc2 this segfaults for me even if std.parallelism is removed entirely. With DMD and std.parallelism removed it runs to completion. With DMD and no changes it never seems to finish. I reckon that there's some other memory error and that the parallelism is unrelated.
Jun 25
next sibling parent jfondren <julian.fondren gmail.com> writes:
On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
 I reckon that there's some other memory error and that the 
 parallelism is unrelated.
safe: ``` source/AI.d(83,23): Error: cannot take address of local `rData` in ` safe` function `main` source/analysisEngine.d(560,20): Error: cannot take address of local `rd_flattened` in ` safe` function `add_missingPoints` source/analysisEngine.d(344,5): Error: can only catch class objects derived from `Exception` in ` safe` code, not `core.exception.RangeError` ``` And then, about half complaints about system dlib calls and complaints about `void*` <-> `rawData[]*` casting The RangeError catch likely does nothing with your compile flags. Even if you don't intend to use safe in the end, I'd bet that the segfaults are due to what it's complaining about.
Jun 25
prev sibling parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 14:44:13 UTC, seany wrote:
 This particular location does not cause segfault.
 It is segfaulting down the line in a completely unrelated 
 location... Wait I will try to make a MWP.
[Here is MWP](https://github.com/naturalmechanics/mwp). Please compile with `dub build -b release --compiler=ldc2 `. Then to run, please use : `./tracker_ai --filename 21010014-86.ptl `
With ldc2 this segfaults for me even if std.parallelism is removed entirely. With DMD and std.parallelism removed it runs to completion. With DMD and no changes it never seems to finish. I reckon that there's some other memory error and that the parallelism is unrelated.
Try : (this version)[https://github.com/naturalmechanics/mwp/tree/nested-loops] The goal is to parallelize : `calculate_avgSweepDist_pairwise` at line `3836`. Notice there we have 6 nested loops. Thank you.
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 15:50:37 UTC, seany wrote:
 On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 14:44:13 UTC, seany wrote:
 This particular location does not cause segfault.
 It is segfaulting down the line in a completely unrelated 
 location... Wait I will try to make a MWP.
[Here is MWP](https://github.com/naturalmechanics/mwp). Please compile with `dub build -b release --compiler=ldc2 `. Then to run, please use : `./tracker_ai --filename 21010014-86.ptl `
With ldc2 this segfaults for me even if std.parallelism is removed entirely. With DMD and std.parallelism removed it runs to completion. With DMD and no changes it never seems to finish. I reckon that there's some other memory error and that the parallelism is unrelated.
Try : (this version)[https://github.com/naturalmechanics/mwp/tree/nested-loops] The goal is to parallelize : `calculate_avgSweepDist_pairwise` at line `3836`. Notice there we have 6 nested loops. Thank you.
Ok, i stopped the buss error and the segfault. It was indeed an index that was written wrong in the flattened version . No, I dont have the seg fault any more. But I have "error creating thread" - time to time. Not always. But, even with the taskpool, it is not spreading to multiple cores.
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 16:37:06 UTC, seany wrote:
 On Friday, 25 June 2021 at 15:50:37 UTC, seany wrote:
 On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
 [...]
Try : (this version)[https://github.com/naturalmechanics/mwp/tree/nested-loops] The goal is to parallelize : `calculate_avgSweepDist_pairwise` at line `3836`. Notice there we have 6 nested loops. Thank you.
Ok, i stopped the buss error and the segfault. It was indeed an index that was written wrong in the flattened version . No, I dont have the seg fault any more. But I have "error creating thread" - time to time. Not always. But, even with the taskpool, it is not spreading to multiple cores.
PS: this is the error message : "core.thread.threadbase.ThreadError src/core/thread/threadbase.d(1219): Error creating thread"
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 16:37:44 UTC, seany wrote:
 On Friday, 25 June 2021 at 16:37:06 UTC, seany wrote:
 On Friday, 25 June 2021 at 15:50:37 UTC, seany wrote:
 On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
 [...]
Try : (this version)[https://github.com/naturalmechanics/mwp/tree/nested-loops] The goal is to parallelize : `calculate_avgSweepDist_pairwise` at line `3836`. Notice there we have 6 nested loops. Thank you.
Ok, i stopped the buss error and the segfault. It was indeed an index that was written wrong in the flattened version . No, I dont have the seg fault any more. But I have "error creating thread" - time to time. Not always. But, even with the taskpool, it is not spreading to multiple cores.
PS: this is the error message : "core.thread.threadbase.ThreadError src/core/thread/threadbase.d(1219): Error creating thread"
If i use `parallel(...)`it runs. If i use `prTaskPool.parallel(...`, then in the line : `auto prTaskPool = new TaskPool(threadCount);` it hits the error. Please help.
Jun 25
parent reply jfondren <julian.fondren gmail.com> writes:
On Friday, 25 June 2021 at 19:17:38 UTC, seany wrote:
 If i use `parallel(...)`it runs.

 If i use `prTaskPool.parallel(...`, then in the line : `auto 
 prTaskPool = new TaskPool(threadCount);` it hits the error. 
 Please help.
parallel() reuses a single taskPool that's only established once. Your code creates two TaskPools per a function invocation and you call that function in a loop. stracing your program might again reveal the error you're hitting.
Jun 25
parent reply seany <seany uni-bonn.de> writes:
On Friday, 25 June 2021 at 19:30:16 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 19:17:38 UTC, seany wrote:
 If i use `parallel(...)`it runs.

 If i use `prTaskPool.parallel(...`, then in the line : `auto 
 prTaskPool = new TaskPool(threadCount);` it hits the error. 
 Please help.
parallel() reuses a single taskPool that's only established once. Your code creates two TaskPools per a function invocation and you call that function in a loop. stracing your program might again reveal the error you're hitting.
I have removed one - same problem. Yes, I do call it in a loop. how can I create a taskpool in a function that itself will be called in a loop?
Jun 25
parent jfondren <julian.fondren gmail.com> writes:
On Friday, 25 June 2021 at 19:52:23 UTC, seany wrote:
 On Friday, 25 June 2021 at 19:30:16 UTC, jfondren wrote:
 On Friday, 25 June 2021 at 19:17:38 UTC, seany wrote:
 If i use `parallel(...)`it runs.

 If i use `prTaskPool.parallel(...`, then in the line : `auto 
 prTaskPool = new TaskPool(threadCount);` it hits the error. 
 Please help.
parallel() reuses a single taskPool that's only established once. Your code creates two TaskPools per a function invocation and you call that function in a loop. stracing your program might again reveal the error you're hitting.
I have removed one - same problem. Yes, I do call it in a loop. how can I create a taskpool in a function that itself will be called in a loop?
One option is to do as parallel does: https://github.com/dlang/phobos/blob/master/std/parallelism.d#L3508
Jun 25
prev sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 6/24/21 1:33 PM, Bastiaan Veelo wrote:

 distributes the load across all cores (but one).
Last time I checked, the current thread would run tasks as well. Ali
Jun 24
parent Bastiaan Veelo <Bastiaan Veelo.net> writes:
On Thursday, 24 June 2021 at 20:56:26 UTC, Ali Çehreli wrote:
 On 6/24/21 1:33 PM, Bastiaan Veelo wrote:

 distributes the load across all cores (but one).
Last time I checked, the current thread would run tasks as well. Ali
Indeed, thanks. — Bastiaan.
Jun 24