digitalmars.D.learn - How to call stop from parallel foreach
- seany (17/17) Jun 24 2021 I have seen
- Jerry (7/24) Jun 24 2021 Maybe I'm wrong here, but I don't think there is any way to do
- seany (3/11) Jun 24 2021 The parallel() function is running a taskpool. I should be able
- seany (18/30) Jun 24 2021 PS :
- Bastiaan Veelo (20/26) Jun 24 2021 Yes there is, but it won’t break the `foreach`:
- seany (15/20) Jun 24 2021 Okey. So consider :
- Bastiaan Veelo (14/35) Jun 24 2021 You can nest multiple `foreach`, but only parallelise one like so:
- Bastiaan Veelo (4/16) Jun 24 2021 Actually, I think this would be suboptimal as well, as the three
- Ali Çehreli (17/19) Jun 24 2021 Yes. You have to create a task pool explicitly:
- seany (17/37) Jun 25 2021 I tried this .
- seany (3/21) Jun 25 2021 PS. line
- Ali Çehreli (15/31) Jun 25 2021 Performance is not guaranteed depending on many factors. For example,
- seany (5/28) Jun 25 2021 The code without the parallel foreach works fine. No segfault.
- Ali Çehreli (7/8) Jun 25 2021 That's very common.
- seany (2/10) Jun 25 2021 I have added MWP. did you have a chance to look at it?
- jfondren (13/25) Jun 25 2021 A self-contained and complete example would help a lot, but the
- seany (4/21) Jun 25 2021 This particular location does not cause segfault.
- seany (4/28) Jun 25 2021 [Here is MWP](https://github.com/naturalmechanics/mwp).
- jfondren (8/16) Jun 25 2021 With ldc2 this segfaults for me even if std.parallelism is removed
- Ali Çehreli (3/4) Jun 24 2021 Last time I checked, the current thread would run tasks as well.
- Bastiaan Veelo (3/7) Jun 24 2021 Indeed, thanks.
I have seen [this](https://forum.dlang.org/thread/akhbvvjgeaspmjntznyk@forum.dlang.org). I can't call break from a parallel foreach. Okay. Is there a way to easily call .stop() in such a case? Here is a case to consider:

```d
outer:
foreach (i, a; parallel(array_of_a)) {
    foreach (j, b; parallel(array_of_b)) {
        auto c = myFunction0(i, j);
        auto d = myFunction1(a, b);
        auto f = myFunction2(i, b);
        auto g = myFunction3(a, j);

        if (someConditionCheck(c, d, f, g)) {
            // stop the outer foreach loop here
        }
    }
}
```

Thank you
Jun 24 2021
On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
> I can't call break from a parallel foreach. Okay. Is there a way to easily call .stop() in such a case?

Maybe I'm wrong here, but I don't think there is any way to do that with parallel. What I would do is negate someConditionCheck and instead only do work when there is work to be done. Obviously that may or may not be suitable. But with parallel I don't see any way to make it happen.
Jun 24 2021
On Thursday, 24 June 2021 at 19:46:52 UTC, Jerry wrote:
> Maybe I'm wrong here, but I don't think there is any way to do that with parallel.

The parallel() function is running a taskpool. I should be able to stop that in any case...
Jun 24 2021
On Thursday, 24 June 2021 at 20:08:06 UTC, seany wrote:
> The parallel() function is running a taskpool. I should be able to stop that in any case...

PS: I have done this:

```d
parallelContainer:
while (true) {
    outer:
    foreach (i, a; parallel(array_of_a)) {
        foreach (j, b; parallel(array_of_b)) {
            auto c = myFunction0(i, j);
            auto d = myFunction1(a, b);
            auto f = myFunction2(i, b);
            auto g = myFunction3(a, j);

            if (someConditionCheck(c, d, f, g)) {
                break parallelContainer;
            }
        }
    }
    break;
}
```

Is this safe? Will this cause any problem that I can't foresee now? Thank you
Jun 24 2021
On Thursday, 24 June 2021 at 18:23:01 UTC, seany wrote:
> I can't call break from a parallel foreach. Okay. Is there a way to easily call .stop() in such a case?

Yes there is, but it won't break the `foreach`:

```d
auto tp = taskPool;
foreach (i, ref e; tp.parallel(a))
{
    // …
    tp.stop;
}
```

The reason this does not work is because `stop` terminates the worker threads as soon as they are finished with their current `Task`, but no sooner. `parallel` creates the `Task`s before it presents a range to `foreach`, so no new `Task`s are created during iteration. Therefore all elements are iterated.

> outer: foreach(i, a; parallel(array_of_a)) {
>     foreach(j, b; parallel(array_of_b)) {

By the way, nesting parallel `foreach` does not make much sense, as one level already distributes the load across all cores (but one). Additional parallelisation will likely just add overhead, and have a net negative effect.

— Bastiaan.
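Since `stop` cannot cancel `Task`s that were created before iteration began, a common workaround is a shared flag that the remaining iterations check and skip on. This is a sketch, not code from this thread; the `a == 5` test is a stand-in for the original `someConditionCheck(c, d, f, g)`:

```d
import std.parallelism;
import core.atomic;

void main()
{
    auto array_of_a = [1, 2, 3, 4, 5, 6, 7, 8];
    shared bool stopRequested = false;

    foreach (i, a; parallel(array_of_a))
    {
        // Every iteration still runs, but the flag turns the
        // remaining bodies into cheap no-ops.
        if (atomicLoad(stopRequested))
            continue;

        // ... real work here ...
        if (a == 5) // stand-in for someConditionCheck(c, d, f, g)
            atomicStore(stopRequested, true);
    }
}
```

The loop still visits every element (so this is not a true `break`), but once the flag is set each remaining body returns immediately, which is usually close enough to an early exit. Note that `continue` is allowed inside a parallel `foreach` body, unlike `break`.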
Jun 24 2021
On Thursday, 24 June 2021 at 20:33:00 UTC, Bastiaan Veelo wrote:
> By the way, nesting parallel `foreach` does not make much sense, as one level already distributes the load across all cores (but one).

Okay. So consider:

```d
foreach (array_elem; parallel(an_array)) {
    dothing(array_elem);
}
```

and then in `dothing()`:

```d
foreach (subelem; array_elem) {
    dootherthing(subelem);
}
```

Will this ALSO cause the same overhead? Is there any way to control the number of CPU cores used in parallelization? E.g.: take 3 cores for the first parallel foreach, and then for the second one, take 3 cores each -> so 3 + 3 * 3 = 12 cores out of a 16 core system? Thank you.
Jun 24 2021
On Thursday, 24 June 2021 at 20:41:40 UTC, seany wrote:
> Will this ALSO cause the same overhead?

You can nest multiple `foreach`, but only parallelise one like so:

```d
foreach (array_elem; parallel(an_array))
    foreach (subelem; array_elem)
        dootherthing(subelem);
```

So there is no need to hide one of them in a function.

> Is there any way to control the number of CPU cores used in parallelization?

There might be, by using various `TaskPool`s with a smaller number of worker threads: https://dlang.org/phobos/std_parallelism.html#.TaskPool.this.2. But I cannot see the benefit of doing this. It will just distribute the same amount of work in a different way.

— Bastiaan.
Jun 24 2021
On Thursday, 24 June 2021 at 21:05:28 UTC, Bastiaan Veelo wrote:
> There might be, by using various `TaskPool`s with a smaller number of worker threads.

Actually, I think this would be suboptimal as well, as the three outer threads seem to do no real work.

— Bastiaan.
Jun 24 2021
On 6/24/21 1:41 PM, seany wrote:
> Is there any way to control the number of CPU cores used in parallelization?

Yes. You have to create a task pool explicitly:

```d
import std.parallelism;

void main() {
    enum threadCount = 2;
    auto myTaskPool = new TaskPool(threadCount);
    scope (exit) {
        myTaskPool.finish();
    }

    enum workUnitSize = 1; // Or 42 or something else. :)
    foreach (e; myTaskPool.parallel([ 1, 2, 3 ], workUnitSize)) {
        // ...
    }
}
```

I've touched on a few parallelism concepts at this point in a presentation: https://www.youtube.com/watch?v=dRORNQIB2wA&t=1332s

Ali
Jun 24 2021
On Thursday, 24 June 2021 at 21:19:19 UTC, Ali Çehreli wrote:
> Yes. You have to create a task pool explicitly:

I tried this:

```d
int[][] pnts;
pnts.length = fld.length;

enum threadCount = 2;
auto prTaskPool = new TaskPool(threadCount);
scope (exit) {
    prTaskPool.finish();
}

enum workUnitSize = 1;
foreach (i, fLine; prTaskPool.parallel(fld, workUnitSize)) {
    // ....
}
```

This is throwing random segfaults. CPU has 2 cores, but usage is not going above 37%. Even much deeper down in the program, much further down the line... And the location of the segfault is random.
Jun 25 2021
On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
> This is throwing random segfaults.

PS. As in [this thread](https://forum.dlang.org/thread/gomhpxzolddnodaeyvlh@forum.dlang.org), I am running into bus errors too, sometimes way down the line after these foreach calls are completed...
Jun 25 2021
On 6/25/21 6:53 AM, seany wrote:
> This is throwing random segfaults. CPU has 2 cores, but usage is not going above 37%

Performance is not guaranteed depending on many factors. For example, inserting a writeln() call in the loop would make all threads compete with each other for stdout. There can be many contention points, some of which depend on your program logic. (And "Amdahl's Law" applies.)

Another reason: 1 can be a horrible value for workUnitSize. Try 100, 1000, etc. and see whether it helps with performance.

> Even much deeper down in the program, much further down the line... And the location of the segfault is random.

Do you still have two parallel loops? Are both with explicit TaskPool objects? If not, I wonder whether multiple threads are using the convenient 'parallel' function, stepping over each others' toes. (I am not sure about this because perhaps it's safe to do this; never tested.)

It is possible that the segfaults are caused by your code. The code you showed in your original post (myFunction0() and others), they all work on independent data structures, right?

Ali
Jun 25 2021
On Friday, 25 June 2021 at 14:10:52 UTC, Ali Çehreli wrote:
> It is possible that the segfaults are caused by your code.

The code without the parallel foreach works fine. No segfault. In several instances, I do have multiple nested loops, but in every case only the outer one is a parallel foreach. All of them are with an explicit taskpool definition.
Jun 25 2021
On 6/25/21 7:21 AM, seany wrote:
> The code without the parallel foreach works fine. No segfault.

That's very common. What I meant is, is the code written in a way to work safely in a parallel foreach loop? (i.e. Is the code "independent"?) (But I assume it is because it's been the common theme in this thread; so there must be something stranger going on.)

Ali
Jun 25 2021
On Friday, 25 June 2021 at 15:08:38 UTC, Ali Çehreli wrote:
> What I meant is, is the code written in a way to work safely in a parallel foreach loop?

I have added the MWP. Did you have a chance to look at it?
Jun 25 2021
On Friday, 25 June 2021 at 13:53:17 UTC, seany wrote:
> int[][] pnts;
> pnts.length = fld.length;
> ...
> foreach (i, fLine; prTaskPool.parallel(fld, workUnitSize)) {

A self-contained and complete example would help a lot, but the likely problem with this code is that you're accessing pnts[y][x] in the loop, which makes the loop bodies no longer independent, because some of them need to first allocate an int[] to replace the zero-length pnts[y] that you're starting with. Consider:

```
$ rdmd --eval 'int[][] p; p.length = 5; p.map!"a.length".writeln'
[0, 0, 0, 0, 0]
```
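One way to make the loop bodies independent again, sketched here with made-up input data and assuming each outer iteration writes only into its own row, is to size every inner array before entering the parallel loop, so no iteration triggers an allocation that another thread can observe:

```d
import std.parallelism;

void main()
{
    // Stand-in for the real fld input.
    auto fld = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];

    int[][] pnts;
    pnts.length = fld.length;

    // Pre-size every inner array up front, on a single thread.
    foreach (i, ref row; pnts)
        row.length = fld[i].length;

    foreach (i, fLine; parallel(fld, 1))
    {
        // Each body now writes only into its own, fully allocated
        // row: no shared mutation between iterations.
        foreach (j, v; fLine)
            pnts[i][j] = v * 2;
    }
}
```

The key point is that all allocation happens sequentially before the parallel loop; the parallel bodies then touch disjoint memory.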
Jun 25 2021
On Friday, 25 June 2021 at 14:13:14 UTC, jfondren wrote:
> A self-contained and complete example would help a lot, but the likely problem with this code is that you're accessing pnts[y][x] in the loop.

This particular location does not cause the segfault. It is segfaulting down the line in a completely unrelated location... Wait, I will try to make a MWP.
Jun 25 2021
On Friday, 25 June 2021 at 14:22:25 UTC, seany wrote:
> Wait, I will try to make a MWP.

[Here is the MWP](https://github.com/naturalmechanics/mwp). Please compile with `dub build -b release --compiler=ldc2`. Then to run, please use: `./tracker_ai --filename 21010014-86.ptl`
Jun 25 2021
On Friday, 25 June 2021 at 14:44:13 UTC, seany wrote:
> [Here is the MWP](https://github.com/naturalmechanics/mwp).

With ldc2 this segfaults for me even if std.parallelism is removed entirely. With DMD and std.parallelism removed it runs to completion. With DMD and no changes it never seems to finish. I reckon that there's some other memory error and that the parallelism is unrelated.
Jun 25 2021
On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
> I reckon that there's some other memory error and that the parallelism is unrelated.

With @safe:

```
source/AI.d(83,23): Error: cannot take address of local `rData` in `@safe` function `main`
source/analysisEngine.d(560,20): Error: cannot take address of local `rd_flattened` in `@safe` function `add_missingPoints`
source/analysisEngine.d(344,5): Error: can only catch class objects derived from `Exception` in `@safe` code, not `core.exception.RangeError`
```

And then about half are complaints about @system dlib calls, and complaints about `void*` <-> `rawData[]*` casting. The RangeError catch likely does nothing with your compile flags. Even if you don't intend to use @safe in the end, I'd bet that the segfaults are due to what it's complaining about.
Jun 25 2021
On Friday, 25 June 2021 at 15:16:30 UTC, jfondren wrote:
> With ldc2 this segfaults for me even if std.parallelism is removed entirely.

Try [this version](https://github.com/naturalmechanics/mwp/tree/nested-loops). The goal is to parallelize `calculate_avgSweepDist_pairwise` at line `3836`. Notice there we have 6 nested loops. Thank you.
Jun 25 2021
On Friday, 25 June 2021 at 15:50:37 UTC, seany wrote:
> Try [this version](https://github.com/naturalmechanics/mwp/tree/nested-loops).

Ok, I stopped the bus error and the segfault. It was indeed an index that was written wrong in the flattened version. No, I don't have the segfault any more. But I get "error creating thread" from time to time. Not always. But even with the taskpool, it is not spreading to multiple cores.
Jun 25 2021
On Friday, 25 June 2021 at 16:37:06 UTC, seany wrote:
> But I get "error creating thread" from time to time.

PS: this is the error message: "core.thread.threadbase.ThreadError@src/core/thread/threadbase.d(1219): Error creating thread"
Jun 25 2021
On Friday, 25 June 2021 at 16:37:44 UTC, seany wrote:
> PS: this is the error message: "core.thread.threadbase.ThreadError@src/core/thread/threadbase.d(1219): Error creating thread"

If I use `parallel(...)` it runs. If I use `prTaskPool.parallel(...)`, then in the line `auto prTaskPool = new TaskPool(threadCount);` it hits the error. Please help.
Jun 25 2021
On Friday, 25 June 2021 at 19:17:38 UTC, seany wrote:
> If I use `parallel(...)` it runs. If I use `prTaskPool.parallel(...)`, then in the line `auto prTaskPool = new TaskPool(threadCount);` it hits the error. Please help.

parallel() reuses a single taskPool that's only established once. Your code creates two TaskPools per function invocation, and you call that function in a loop. stracing your program might again reveal the error you're hitting.
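The fix implied above can be sketched like this: create the `TaskPool` once, outside the loop, and pass it into the repeatedly called function. `processOnce` and the per-element work are hypothetical stand-ins for the thread's real code:

```d
import std.parallelism;

// Reuses the caller's pool instead of constructing a new pool
// (and new OS threads) on every invocation.
void processOnce(TaskPool pool, int[] data)
{
    foreach (i, e; pool.parallel(data, 1))
    {
        // ... per-element work ...
    }
}

void main()
{
    enum threadCount = 2;
    auto pool = new TaskPool(threadCount); // created exactly once
    scope (exit) pool.finish();

    foreach (iteration; 0 .. 1_000)
        processOnce(pool, [1, 2, 3]); // same pool every time
}
```

Constructing a `TaskPool` spawns real threads, so building one per call in a hot loop can exhaust the OS thread limit, which would explain an intermittent "Error creating thread".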
Jun 25 2021
On Friday, 25 June 2021 at 19:30:16 UTC, jfondren wrote:
> Your code creates two TaskPools per function invocation, and you call that function in a loop.

I have removed one; same problem. Yes, I do call it in a loop. How can I create a taskpool in a function that itself will be called in a loop?
Jun 25 2021
On Friday, 25 June 2021 at 19:52:23 UTC, seany wrote:
> How can I create a taskpool in a function that itself will be called in a loop?

One option is to do as parallel does: https://github.com/dlang/phobos/blob/master/std/parallelism.d#L3508
Jun 25 2021
On 6/24/21 1:33 PM, Bastiaan Veelo wrote:
> distributes the load across all cores (but one).

Last time I checked, the current thread would run tasks as well.

Ali
Jun 24 2021
On Thursday, 24 June 2021 at 20:56:26 UTC, Ali Çehreli wrote:
> Last time I checked, the current thread would run tasks as well.

Indeed, thanks.

— Bastiaan.
Jun 24 2021