digitalmars.D.learn - associative array with Parallel

reply seany <seany uni-bonn.de> writes:
Consider :

     int [] ii;
     foreach(i,dummy; parallel(somearray)) {
       ii ~= somefunc(dummy);
     }


This is not safe, because all threads access the same array and 
try to append to it, which leads to a data race.


But :

     int [] ii;
     ii.length = somearray.length;
     foreach(i,dummy; parallel(somearray)) {
       ii[i] ~= somefunc(dummy);
     }


This is safe. In this case, each thread accesses its own unique 
memory location.

But what about this :

     int [ string ] ii;
     ii.length = somearray.length;
     foreach(i,dummy; parallel(somearray)) {
       string j = generateUniqueString(i);
       ii[j] ~= somefunc(dummy);
     }


Is this also guaranteed thread safe?


In my 5 runs, I did not see any problems, but I'd like to 
confirm. Thank you.
Jul 21
next sibling parent reply jfondren <julian.fondren gmail.com> writes:
On Thursday, 22 July 2021 at 05:46:25 UTC, seany wrote:
 But what about this :

     int [ string ] ii;
     ii.length = somearray.length;
     foreach(i,dummy; parallel(somearray)) {
       string j = generateUniqueString(i);
       ii[j] ~= somefunc(dummy);
     }


 Is this also guaranteed thread safe?
No. Consider
https://programming.guide/hash-tables-open-vs-closed-addressing.html

In the open-addressing case, one thread may be searching the backing array while another thread is modifying it. In the closed-addressing case, one thread may be modifying a linked list while another thread is searching it.
Jul 21
next sibling parent reply frame <frame86 live.com> writes:
On Thursday, 22 July 2021 at 05:53:01 UTC, jfondren wrote:
 On Thursday, 22 July 2021 at 05:46:25 UTC, seany wrote:
 But what about this :

     int [ string ] ii;
     ii.length = somearray.length;
     foreach(i,dummy; parallel(somearray)) {
       string j = generateUniqueString(i);
       ii[j] ~= somefunc(dummy);
     }


 Is this also guaranteed thread safe?
No. Consider https://programming.guide/hash-tables-open-vs-closed-addressing.html In the open-addressing case, one thread may be searching the backing array while another thread is modifying it. In the closed-addressing case, one thread may be modifying a linked list while another thread is searching it.
This is another parallel foreach body conversion question. Isn't the compiler clever enough to put a synchronized block here?
Jul 21
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 7/21/21 11:01 PM, frame wrote:

 This is another parallel foreach body conversion question.
 Isn't the compiler clever enough to put a synchronized block here?
parallel is a *function* (not a D feature). So, the compiler might have to analyze the entire code to suspect race conditions. No, D does not have such features.

But even if it did, we wouldn't want synchronized blocks in parallelization because a synchronized block would run a single thread at a time and nothing would be running in parallel anymore.

Ali
Jul 21
parent frame <frame86 live.com> writes:
On Thursday, 22 July 2021 at 06:47:52 UTC, Ali Çehreli wrote:

 But even if it did, we wouldn't want synchronized blocks in 
 parallelization because a synchronized block would run a single 
 thread at a time and nothing would be running in parallel 
 anymore.
But it only affects the block; the other code could still run in parallel until this point is reached by any thread.

Well, I had just assumed that the compiler does a conversion here, knowing about the parallel stuff. Of course there is no room for such a feature if the compiler only converts the foreach body into a delegate for the opApply() method.

Thanks for the clarification.
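
A minimal sketch of that point, applied to the appending loop from the first post (not code from the thread): the expensive call runs outside the lock, and only the append itself is serialized by a `synchronized` statement. `somefunc` here is a hypothetical stand-in, and `appendSynchronized` is a name made up for illustration.

```d
import std.algorithm.sorting : sort;
import std.parallelism : parallel;

// Hypothetical stand-in for the thread's `somefunc`.
int somefunc(int x) { return x + 1; }

int[] appendSynchronized(int[] somearray)
{
    int[] ii;
    foreach (dummy; parallel(somearray))
    {
        // Computed outside the lock, so this part still runs in parallel.
        auto result = somefunc(dummy);

        // Only one thread at a time may execute this block.
        synchronized
        {
            ii ~= result;
        }
    }
    // Append order depends on thread scheduling, so sort for a stable result.
    sort(ii);
    return ii;
}

void main()
{
    assert(appendSynchronized([1, 2, 3, 4]) == [2, 3, 4, 5]);
}
```

If `somefunc` is cheap, the lock dominates and little is gained over a serial loop, which is Ali's objection.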
Jul 22
prev sibling parent reply seany <seany uni-bonn.de> writes:
On Thursday, 22 July 2021 at 05:53:01 UTC, jfondren wrote:

 No. Consider 
 https://programming.guide/hash-tables-open-vs-closed-addressing.html
The page says :
 A key is always stored in the bucket it's hashed to.
What if my keys are always unique?
Jul 22
parent reply jfondren <julian.fondren gmail.com> writes:
On Thursday, 22 July 2021 at 07:23:36 UTC, seany wrote:
 On Thursday, 22 July 2021 at 05:53:01 UTC, jfondren wrote:

 No. Consider 
 https://programming.guide/hash-tables-open-vs-closed-addressing.html
The page says :
 A key is always stored in the bucket it's hashed to.
What if my keys are always unique?
That has no bearing on the problem. Two of your unique keys might map to the same bucket.
Jul 22
parent reply seany <seany uni-bonn.de> writes:
On Thursday, 22 July 2021 at 07:27:52 UTC, jfondren wrote:
 On Thursday, 22 July 2021 at 07:23:36 UTC, seany wrote:
 On Thursday, 22 July 2021 at 05:53:01 UTC, jfondren wrote:

 No. Consider 
 https://programming.guide/hash-tables-open-vs-closed-addressing.html
The page says :
 A key is always stored in the bucket it's hashed to.
What if my keys are always unique?
That has no bearing on the problem. Two of your unique keys might map to the same bucket.
OK. Sorry for the bad question: what if I pregenerate every possible key and fill the associative array so that each key maps to some invalid number, say -1? Then, during processing, the parallel code can grab the specific key locations. Will that also create the same problem?
Jul 22
parent reply jfondren <julian.fondren gmail.com> writes:
On Thursday, 22 July 2021 at 07:51:04 UTC, seany wrote:
 OK.
 Sorry for the bad question : what if i pregenerate every 
 possible key, and fill the associative array where each such 
 key contains some invalid number, say -1 ?
You mean where each value contains some invalid number, and the AA's keys are never changed during the parallel code? Yeah, that should work.
Jul 22
parent seany <seany uni-bonn.de> writes:
On Thursday, 22 July 2021 at 09:02:56 UTC, jfondren wrote:
 On Thursday, 22 July 2021 at 07:51:04 UTC, seany wrote:
 OK.
 Sorry for the bad question : what if i pregenerate every 
 possible key, and fill the associative array where each such 
 key contains some invalid number, say -1 ?
You mean where each value contains some invalid number, and the AA's keys are never changed during the parallel code? Yeah, that should work.
Yes, the keys are never changed during the parallel code execution. keys are pre-generated.
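
A minimal sketch of the pre-generated-key approach discussed above, under the assumption that every key is inserted serially before the parallel loop starts. `generateUniqueString` and `somefunc` are hypothetical stand-ins for the functions named in the thread, and `fillExistingKeys` is a name made up for illustration. The `in` operator is used for the lookup because it returns a pointer to an existing value and never inserts.

```d
import std.conv : to;
import std.parallelism : parallel;

// Hypothetical stand-ins for the functions named in the thread.
string generateUniqueString(size_t i) { return "key" ~ i.to!string; }
int somefunc(int x) { return x * 2; }

int[string] fillExistingKeys(int[] somearray)
{
    int[string] ii;

    // Serial phase: insert every key with a placeholder value of -1.
    foreach (i; 0 .. somearray.length)
        ii[generateUniqueString(i)] = -1;

    // Parallel phase: look up existing slots only, never insert.
    foreach (i, dummy; parallel(somearray))
    {
        auto slot = generateUniqueString(i) in ii;
        assert(slot !is null, "key was not pre-generated");
        *slot = somefunc(dummy); // each task writes a different value slot
    }
    return ii;
}

void main()
{
    auto ii = fillExistingKeys([10, 20, 30]);
    assert(ii["key1"] == 40);
}
```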
Jul 22
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/22/21 1:46 AM, seany wrote:
 Consider :
 
      int [] ii;
      foreach(i,dummy; parallel(somearray)) {
        ii ~= somefunc(dummy);
      }
 
 
 This is not safe, because all threads are accessing the same array and 
 trying to add values and leading to collision.
Correct. You must synchronize on ii.
 
 
 But :
 
      int [] ii;
      ii.length = somearray.length;
      foreach(i,dummy; parallel(somearray)) {
        ii[i] ~= somefunc(dummy);
      }
 
 
 This is safe. In this case, threads are accessing an unique memory 
 location each.
This isn't valid code, because you can't append to an integer. Though I think I know what you meant. Is it thread-safe (assuming the array elements are appendable)? I think so, but I'd have to see a working example.
 
 But what about this :
 
      int [ string ] ii;
      ii.length = somearray.length;
      foreach(i,dummy; parallel(somearray)) {
        string j = generateUniqueString(i);
        ii[j] ~= somefunc(dummy);
      }
 
 
 Is this also guaranteed thread safe?
First, this also isn't valid code. You can't set the length of an AA. But I'm assuming that length setting is really a placeholder for initialization (in your real code). Also, again, you cannot append to an integer.

Second, as long as you don't modify the AA *structure*, you can use it in parallel code. In this case, you are generating some string, and appending to that. I don't know what your `generateUniqueString` is doing, nor do I know what's actually stored as keys in the AA, as your initialization code is hidden.

If every `j` is guaranteed to already exist as a key in the AA, and the code is made to be valid, then I think it is thread-safe. If any access with a key `j` is inserting a new AA bucket, it is *not* thread-safe.

However, this is a tall order, and highly depends on your code. The compiler cannot help you here.
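
When the keys cannot be inserted ahead of time, one way to avoid touching the AA structure from several threads is to keep the AA out of the parallel loop entirely. The sketch below is an alternative not proposed in the thread: it writes keys and values into index-addressed arrays in parallel, then builds the AA serially. `generateUniqueString`, `somefunc` and `buildAfterwards` are hypothetical names.

```d
import std.conv : to;
import std.parallelism : parallel;

// Hypothetical stand-ins for the functions named in the thread.
string generateUniqueString(size_t i) { return "k" ~ i.to!string; }
int somefunc(int x) { return x * x; }

int[string] buildAfterwards(int[] somearray)
{
    auto keys = new string[](somearray.length);
    auto values = new int[](somearray.length);

    // Parallel phase: each task writes only to its own index.
    foreach (i, dummy; parallel(somearray))
    {
        keys[i] = generateUniqueString(i);
        values[i] = somefunc(dummy);
    }

    // Serial phase: a single thread performs all AA insertions.
    int[string] ii;
    foreach (i; 0 .. keys.length)
        ii[keys[i]] = values[i];
    return ii;
}

void main()
{
    auto ii = buildAfterwards([2, 3, 4]);
    assert(ii["k1"] == 9);
}
```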
 In my 5 runs, I did not see any problems, but I'd like to confirm. Thank 
 you.
Testing 5 times is not a substitute for proving the thread safety.

I have learned one thing long ago about threads and race conditions. Just don't do it. Ever. Even if you test 10000 times, and it doesn't fail, it will eventually. I've had code that hit a race condition after 2 weeks of running flat-out. Was one of the hardest things I ever had to debug.

-Steve
Jul 22
parent seany <seany uni-bonn.de> writes:
On Thursday, 22 July 2021 at 16:39:45 UTC, Steven Schveighoffer 
wrote:
 On 7/22/21 1:46 AM, seany wrote:
 [...]
Correct. You must synchronize on ii.
 [...]
This isn't valid code, because you can't append to an integer. Though I think I know what you meant. Is it thread-safe (assuming the array elements are appendable)? I think so, but I'd have to see a working example. [...]
You are right. In the pseudocode, I wanted to say: `ii[i] = somefunc(dummy);`
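
For reference, a minimal runnable sketch of the corrected second example, with the array sized up front and each iteration assigning to its own index. `somefunc` is a hypothetical stand-in and `fillByIndex` is a name made up for illustration.

```d
import std.parallelism : parallel;
import std.stdio : writeln;

// Hypothetical stand-in for the thread's `somefunc`.
int somefunc(int x) { return x * x; }

int[] fillByIndex(int[] somearray)
{
    // Size the result up front so the loop never appends or reallocates.
    auto ii = new int[](somearray.length);

    // Each iteration writes only to ii[i], so no two tasks share an element.
    foreach (i, dummy; parallel(somearray))
    {
        ii[i] = somefunc(dummy);
    }
    return ii;
}

void main()
{
    writeln(fillByIndex([1, 2, 3, 4, 5])); // [1, 4, 9, 16, 25]
}
```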
Jul 23