## digitalmars.D.learn - Yet another parallel foreach + continue question

• seany (18/18) Jul 19 Consider :
• H. S. Teoh (21/42) Jul 19 [...]
• seany (5/13) Jul 19 Ok, therefore it means that, if at `j = 13 `i use a continue,
• H. S. Teoh (7/22) Jul 19 No, it will.
• seany (2/15) Jul 19 Even tho, the workunit specified 11 values to a single thread?
• H. S. Teoh (28/43) Jul 19 Logically speaking, the size of the work unit should not change the
• seany (7/20) Jul 19 Okey, thank you.
• Steven Schveighoffer (10/13) Jul 21 `break` should be undefined behavior (it is impossible to know which
• Ali Çehreli (17/35) Jul 19 Arranging the code to its equivalent may reveal the answer:
seany <seany uni-bonn.de> writes:
```
Consider:

for (int i = 0; i < max_value_of_i; i++) {
    foreach ( j, dummyVar;

        if ( boolean_function(i,j) ) continue;
        double d = expensiveFunction(i,j);
        // ... stuff ...
    }
}

I understand that the parallel iterator lazily picks values
of `j` (up to `my_workunitSize` at a time) and executes the loop
body for those values in its own thread.

Say the values of `j` from `10` to `20` are assigned to one thread,
with `my_workunitSize = 11`, and at `j = 13` the `boolean_function`
returns true.

Will the loop then just jump to the next value, `j = 14`,
like a normal for loop? I am having a bit of difficulty
understanding this. Thank you.
```
Jul 19
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
I didn't test this, but I'm pretty sure `continue` inside a parallel
foreach loop simply terminates that iteration early; I don't think it

Basically, what .parallel does under the hood is to create N jobs (where
N is the number of items to iterate over), representing N instances of
the loop body, and assign them to M worker threads to execute. Then it
waits until all N jobs have been completed before it returns.  Which
order the worker threads will pick up the loop body instances is not
specified, and generally is not predictable from user code.

The loop body in this case is translated into a delegate that gets
passed to the task pool's .opApply method; each worker thread that picks
up a job simply invokes the delegate with the right value of the loop
variable. A `continue` translates to returning a specific magic value
from the delegate that tells .opApply that the loop body finished early.
AFAIK, the task pool does not act on this return value, i.e., the other
instances of the loop body will execute regardless.

T
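
To make the delegate mechanism described above concrete, here is a small serial sketch. `Probe` is a hypothetical class invented for this illustration (it is not part of `std.parallelism`, and it is not how the task pool is implemented); its `opApply` only records what the compiler-generated loop-body delegate returns. Under D's `opApply` protocol, a `continue` makes the delegate return 0, while a `break` makes it return a non-zero value.

```d
import std.stdio : writeln;

/// Hypothetical helper, for illustration only: iterates lo .. hi and
/// records the value returned by the loop-body delegate on each call.
final class Probe
{
    int lo, hi;
    int[] returned;

    this(int lo, int hi)
    {
        this.lo = lo;
        this.hi = hi;
    }

    int opApply(scope int delegate(int) dg)
    {
        foreach (v; lo .. hi)
        {
            immutable r = dg(v);   // run one instance of the loop body
            returned ~= r;
            if (r != 0)            // non-zero: the body asked to leave the loop
                return r;
        }
        return 0;
    }
}

/// Loop body uses `continue` at j == 13.
int[] returnsWithContinue()
{
    auto p = new Probe(10, 21);
    foreach (j; p)
    {
        if (j == 13) continue;
    }
    return p.returned;
}

/// Loop body uses `break` at j == 13.
int[] returnsWithBreak()
{
    auto p = new Probe(10, 21);
    foreach (j; p)
    {
        if (j == 13) break;
    }
    return p.returned;
}

void main()
{
    writeln(returnsWithContinue()); // one entry per value of j
    writeln(returnsWithBreak());    // stops after the entry for j == 13
}
```

With `continue`, the delegate is still invoked for every value from 10 to 20, so the iteration at `j = 14` runs as usual. Only a non-zero return (from `break`, `return`, or `goto`) asks the iterating code to stop.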

Jul 19
seany <seany uni-bonn.de> writes:
Ok, so does that mean that if I use a `continue` at `j = 13`,
then the thread that had `10` ... `20` as values of `j` will
only execute for `j = 10, 11, 12` and will not reach `14` or
later?
Jul 19
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
> Ok, so does that mean that if I use a `continue` at `j = 13`, then the
> thread that had `10` ... `20` as values of `j` will only execute
> for `j = 10, 11, 12` and will not reach `14` or later?

No, it will.

Since each iteration is running in parallel, the fact that one of them
terminated early should not affect the others.

T

Jul 19
seany <seany uni-bonn.de> writes:
> > Ok, so does that mean that if I use a `continue` at `j = 13`,
> > then the thread that had `10` ... `20` as values of `j`
> > will only execute for `j = 10, 11, 12` and will not reach
> > `14` or later?
>
> No, it will.
>
> Since each iteration is running in parallel, the fact that one
> of them terminated early should not affect the others.

Even though the work unit assigned 11 values to a single thread?
Jul 19
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
> > > Ok, so does that mean that if I use a `continue` at `j = 13`, then
> > > the thread that had `10` ... `20` as values of `j` will only
> > > execute for `j = 10, 11, 12` and will not reach `14` or later?
> >
> > No, it will.
> >
> > Since each iteration is running in parallel, the fact that one of
> > them terminated early should not affect the others.
>
> Even though the work unit assigned 11 values to a single thread?

Logically speaking, the size of the work unit should not change the
semantics of the loop. That's just an implementation detail that should
not affect the semantics of the overall computation.  In order to
maintain consistency, loop iterations should not affect each other
(unless they deliberately do so, e.g., read/write from a shared variable
-- but parallel foreach itself should not introduce such a dependency).

I didn't check the implementation to verify this, but I'm pretty sure
`break`, `continue`, etc., in the parallel foreach body do not change
which iterations get run.

Think of it this way: when you use a parallel foreach, what you're
essentially asking for is that, logically speaking, *all* loop
iterations start in parallel (even though in actual implementation that
doesn't actually happen unless you have as many CPUs as you have
iterations). Meaning that by the time a thread gets to the `continue` in
a particular iteration, *all* of the other iterations may already have
started executing.  So it doesn't make sense for any of them to get
interrupted just because this particular iteration executes a
`continue`.  Doing otherwise would introduce all sorts of weird
inconsistent semantics that are hard (if not impossible) to reason about.

While I'm not 100% sure this is what the current parallel foreach
implementation actually does, I'm pretty sure that's the case. It
doesn't make sense to do it any other way.

T

Jul 19
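
The claim that the work unit size does not change which iterations run can be checked with a short experiment. The sketch below is only an illustration: the function name `reachedAfterContinue`, the skipped index 13, and the work unit sizes tried in `main` are arbitrary choices, not values from the thread's real code.

```d
import std.parallelism : parallel;
import std.range : iota;
import std.stdio : writeln;

/// Marks every index whose iteration ran past the `continue`.
/// `skip` and `workUnitSize` are illustrative parameters.
bool[] reachedAfterContinue(size_t n, size_t skip, size_t workUnitSize)
{
    auto reached = new bool[](n);
    foreach (j; parallel(iota(n), workUnitSize))
    {
        if (j == skip) continue; // ends this iteration only
        reached[j] = true;       // each iteration writes its own element
    }
    return reached;
}

void main()
{
    size_t[] sizes = [1, 11, 100];
    foreach (wus; sizes)
        writeln(reachedAfterContinue(30, 13, wus));
}
```

For every work unit size, index 13 should be the only element left `false`. The order in which elements are filled may differ from run to run, but the final array should not.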
seany <seany uni-bonn.de> writes:
> Logically speaking, the size of the work unit should not change
> the semantics of the loop. That's just an implementation detail
> that should not affect the semantics of the overall
> computation.  In order to maintain consistency, loop iterations
> should not affect each other (unless they deliberately do so,
> e.g., read/write from a shared variable -- but parallel foreach
> itself should not introduce such a dependency).

Okay, thank you.

If you later have some time to find out about the exact
implementation, and could help me understand it, I would be most
grateful.

I have checked: [this
Jul 19
Steven Schveighoffer <schveiguy gmail.com> writes:
```
On 7/19/21 10:58 PM, H. S. Teoh wrote:

> I didn't check the implementation to verify this, but I'm pretty sure
> `break`, `continue`, etc., in the parallel foreach body do not change
> which iterations get run.

`break` should be undefined behavior (it is impossible to know which
loops have already executed by that point). `continue` should be fine.

Noted in the

> Breaking from a parallel foreach loop via a break, labeled break,
> labeled continue, return or goto statement throws a ParallelForeachError.

I would say `continue` is ok (probably just implemented as an early
return), but all those others are going to throw an error (unrecoverable).

-Steve
```
Jul 21
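
Since `break` is not available in a parallel foreach, one way to approximate an early exit is to combine a shared flag with `continue`. This pattern was not proposed in the thread; it is a minimal sketch, and `parallelFind`, its parameters, and the assumption that `needle` occurs at most once are all made up for illustration.

```d
import core.atomic : atomicLoad, atomicStore;
import std.parallelism : parallel;
import std.range : iota;

/// Returns the index of `needle` in `haystack`, or -1 if it is absent.
/// Assumes `needle` occurs at most once (illustrative simplification).
ptrdiff_t parallelFind(const int[] haystack, int needle, size_t workUnitSize = 100)
{
    shared bool found = false;
    shared ptrdiff_t where = -1;

    foreach (j; parallel(iota(haystack.length), workUnitSize))
    {
        // Once a match is known, the remaining iterations return early.
        // They are still invoked; they just do no work.
        if (atomicLoad(found)) continue;

        if (haystack[j] == needle)
        {
            atomicStore(where, cast(ptrdiff_t) j);
            atomicStore(found, true);
        }
    }
    return atomicLoad(where);
}

void main()
{
    import std.array : array;
    import std.stdio : writeln;

    auto data = iota(1000).array; // the ints 0 .. 999
    writeln(parallelFind(data, 613));
}
```

Iterations that started before the flag was set still run to completion, so this only cuts down the remaining work rather than stopping the loop at a precise point.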
Ali Çehreli <acehreli yahoo.com> writes:
> for (int i = 0; i < max_value_of_i; i++) {
> my_workunitSize) {
>
>         if ( boolean_function(i,j) ) continue;
>         double d = expensiveFunction(i,j);
>         // ... stuff ...
>     }
> }

Arranging the code to its equivalent may reveal the answer:

    if (!boolean_function(i, j)) {
        double d = expensiveFunction(i, j);
        // ... stuff ...
    }

We removed 'continue' and nothing changed and your question disappeared. :)

> I understand that the parallel iterator lazily picks values of `j`
> (up to `my_workunitSize` at a time) and executes the loop body for
> those values in its own thread.
Yes.

> Say the values of `j` from `10` to `20` are assigned to one thread,
> with `my_workunitSize = 11`, and at `j = 13` the `boolean_function`
> returns true.
>
> Will the loop then just jump to the next value, `j = 14`, like a
> normal for loop?

Yes.

> I am having a bit of difficulty understanding this.
> Thank you.

parallel is only for performance gain. The 2 knobs that it provides are
also for performance reasons:

1) "Use just this many cores, not all"

2) "Process this many elements, not 100 (the default)" because otherwise
context switches are too expensive

Other than that, it shouldn't be any different from running the loop
regularly.

Ali
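
The two knobs mentioned above map onto `std.parallelism` roughly as follows. This is a sketch under assumptions: `parallelSquares` and the numbers passed in `main` are arbitrary, and a private `TaskPool` is used here only to make the worker count explicit.

```d
import std.parallelism : TaskPool;
import std.range : iota;
import std.stdio : writeln;

/// Squares 0 .. n using a private pool. `nWorkers` and `workUnitSize`
/// are the two tuning knobs; neither changes the result.
size_t[] parallelSquares(size_t n, size_t nWorkers, size_t workUnitSize)
{
    auto result = new size_t[](n);
    auto pool = new TaskPool(nWorkers); // knob 1: number of worker threads
    scope (exit) pool.finish();         // let the workers terminate afterwards

    // knob 2: how many elements each work unit hands to a thread
    foreach (j; pool.parallel(iota(n), workUnitSize))
        result[j] = j * j;

    return result;
}

void main()
{
    writeln(parallelSquares(20, 2, 7));
}
```

Changing either argument should affect only how long the call takes, not the contents of the returned array.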
Jul 19