www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.learn - handling shared objects

reply Alex <sascha.orlov gmail.com> writes:
Hi all!
Can somebody explain to me, why the example below is not working 
in a way I'm expecting it to work?

My example is a little bit longer this time, however the half of 
it is taken from
https://dlang.org/library/std/concurrency/receive_only.html

´´´
import std.experimental.all;

struct D
{
	size_t d;
	static S s;
}
struct S
{
	D[] data;
}

struct Model
{
	auto ref s()
	{
		return D.s;
	}

	void run()
	{
		"I'm running".writeln;
		writeln(s.data.length);
	}
}

Model m;

void main()
{
	
	D.s.data.length = 4;
	m.run; //4

	auto childTid = spawn(&runner, thisTid);
	send(childTid, 0);
	receiveOnly!bool;
}

static void runner(Tid ownerTid)
{
	receive((size_t dummy){
		import core.thread : Thread;
				
		m.run;
         // Send a message back to the owner thread
         // indicating success.
         send(ownerTid, true);
     });
}
´´´

The idea is:
the model is something that I can declare deliberately in the 
application. And, I assumed that if it is (globally) shared, then 
so are all compartments of it, even if they are not explicitly 
part of the model.

Some problems arose:
1. Obviously, this is not the case, as the output is different, 
depending on the thread I start the model function.
2. If I declare the model object inside the main, the compiler 
aborts with the message "Aliases to mutable thread-local data not 
allowed."
3. If I mark the S instance as shared, it works. But I didn't 
intend to do this... Is this really how it meant to be?

As I'm writing the model object as well as all of its 
compartments, I can do almost everything... but what I to avoid 
is to declare the instances of compartments inside the model:
They are stored locally to their modules and the single elements 
of them have to know about the compound objects, like with the D 
and S structs shown.
Nov 26 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/26/18 9:00 AM, Alex wrote:
 Hi all!
 Can somebody explain to me, why the example below is not working in a 
 way I'm expecting it to work?
 
 My example is a little bit longer this time, however the half of it is 
 taken from
 https://dlang.org/library/std/concurrency/receive_only.html
 
 ´´´
 import std.experimental.all;
 
 struct D
 {
      size_t d;
      static S s;
 }
 struct S
 {
      D[] data;
 }
 
 struct Model
 {
      auto ref s()
      {
          return D.s;
      }
 
      void run()
      {
          "I'm running".writeln;
          writeln(s.data.length);
      }
 }
 
 Model m;
 
 void main()
 {
 
      D.s.data.length = 4;
      m.run; //4
 
      auto childTid = spawn(&runner, thisTid);
      send(childTid, 0);
      receiveOnly!bool;
 }
 
 static void runner(Tid ownerTid)
 {
      receive((size_t dummy){
          import core.thread : Thread;
 
          m.run;
          // Send a message back to the owner thread
          // indicating success.
          send(ownerTid, true);
      });
 }
 ´´´
 
 The idea is:
 the model is something that I can declare deliberately in the 
 application. And, I assumed that if it is (globally) shared, then so are 
 all compartments of it, even if they are not explicitly part of the model.
 
 Some problems arose:
 1. Obviously, this is not the case, as the output is different, 
 depending on the thread I start the model function.
Yes, unless you declare the model to be shared, there is a copy made for each thread, independently managed.
 2. If I declare the model object inside the main, the compiler aborts 
 with the message "Aliases to mutable thread-local data not allowed."
Right, because you are not allowed to pass unshared data between threads.
 3. If I mark the S instance as shared, it works. But I didn't intend to 
 do this... Is this really how it meant to be?
Let's go over how the data is actually laid out: Model has NO data in it, so it doesn't really matter if it's shared or not. S has a single array of element type D's. There is no static data in S, so it has no static state (only instance state). D has a single size_t, which is thread-local, but has a static instance of S in the TYPE. There is not a copy of an S for each D, just a single copy for each THREAD. If you make this shared, it's a shared copy for all threads. This means the array inside the shared S will be shared between all threads too. What happens when you spawn a new thread is that the thread-local copy of m is created with Model.init (but it has no data, so it's not important). A thread-local copy of D.s is created with S.init (so an empty array). The reason your assignment of length in main doesn't work is because the init value is used, not the current value from the main thread. So yes, you need to make it shared to have the sub-thread see the changes, if that's what you are after.
 As I'm writing the model object as well as all of its compartments, I 
 can do almost everything... but what I to avoid is to declare the 
 instances of compartments inside the model:
 They are stored locally to their modules and the single elements of them 
 have to know about the compound objects, like with the D and S structs 
 shown.
A static member is stored per thread. If you want a global that's shared between all threads, you need to make it shared. But the result may not be what you are looking for, shared can cause difficulty if your code wasn't written to deal with it (and a lot of code isn't). -Steve
Nov 26 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer 
wrote:
 Some problems arose:
 1. Obviously, this is not the case, as the output is 
 different, depending on the thread I start the model function.
Yes, unless you declare the model to be shared, there is a copy made for each thread, independently managed.
 2. If I declare the model object inside the main, the compiler 
 aborts with the message "Aliases to mutable thread-local data 
 not allowed."
Right, because you are not allowed to pass unshared data between threads.
 3. If I mark the S instance as shared, it works. But I didn't 
 intend to do this... Is this really how it meant to be?
Let's go over how the data is actually laid out: Model has NO data in it, so it doesn't really matter if it's shared or not. S has a single array of element type D's. There is no static data in S, so it has no static state (only instance state). D has a single size_t, which is thread-local,
especially, the size_t is local to an instance of D.
 but has a static instance of S in the TYPE.
Right, because to work properly a D instance has to know about the D's in the array of S
 There is not a copy of an S for each D, just a single copy for 
 each THREAD. If you make this shared, it's a shared copy for 
 all threads. This means the array inside the shared S will be 
 shared between all threads too.
This is the point where my headaches begin: I do not need this level of sharedness, but I don't really care.
 What happens when you spawn a new thread is that the 
 thread-local copy of m is created with Model.init (but it has 
 no data, so it's not important). A thread-local copy of D.s is 
 created with S.init (so an empty array). The reason your 
 assignment of length in main doesn't work is because the init 
 value is used, not the current value from the main thread.

 So yes, you need to make it shared to have the sub-thread see 
 the changes, if that's what you are after.

 As I'm writing the model object as well as all of its 
 compartments, I can do almost everything... but what I to 
 avoid is to declare the instances of compartments inside the 
 model:
 They are stored locally to their modules and the single 
 elements of them have to know about the compound objects, like 
 with the D and S structs shown.
A static member is stored per thread. If you want a global that's shared between all threads, you need to make it shared. But the result may not be what you are looking for, shared can cause difficulty if your code wasn't written to deal with it (and a lot of code isn't).
Well, the only reason I use multithreading is this: https://forum.dlang.org/thread/cfrtilrtbahollmazzfv forum.dlang.org So, even if my code is not really shared designed, this doesn't matter, as I wait for "the other" thread to end (or interrupt it). So, marking the model as shared is already a workaround, for being able to pass it to another thread, which I don't really need. However, now, if also all components of the model have to be marked shared, the workaround has to grow and expands over all components (?). This is the reason for this question...
 -Steve
Nov 26 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/26/18 10:16 AM, Alex wrote:
 On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer wrote:
 A static member is stored per thread. If you want a global that's 
 shared between all threads, you need to make it shared. But the result 
 may not be what you are looking for, shared can cause difficulty if 
 your code wasn't written to deal with it (and a lot of code isn't).
Well, the only reason I use multithreading is this: https://forum.dlang.org/thread/cfrtilrtbahollmazzfv forum.dlang.org So, even if my code is not really shared designed, this doesn't matter, as I wait for "the other" thread to end (or interrupt it). So, marking the model as shared is already a workaround, for being able to pass it to another thread, which I don't really need. However, now, if also all components of the model have to be marked shared, the workaround has to grow and expands over all components (?). This is the reason for this question...
Well, if you want to run calculations in another thread, then send the result back to the original, you may be better off sending the state needed for the calculation to the worker thread, and receiving the result back via the messaging system. It's really hard to know the requirements with such toy examples, so maybe that's not workable for you. What it seems like you need is a way to run the calculations in a separate thread. But with multiple threads comes all the dangers of concurrency and races. So you have to be very careful about how you design this. At this point, std.concurrency does not have the ability to safely pass mutable data to another thread without it being shared. Note that if you want to do it without safety in place, you can use the Thread class in core.thread which has no requirements for data to be immutable or shared. But you have to be even more careful about how you access the data. -Steve
Nov 26 2018
parent reply Alex <sascha.orlov gmail.com> writes:
On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer 
wrote:
 Well, if you want to run calculations in another thread, then 
 send the result back to the original, you may be better off 
 sending the state needed for the calculation to the worker 
 thread, and receiving the result back via the messaging system.
How to do this, if parts of the state are statically saved in a type?
 Note that if you want to do it without safety in place, you can 
 use the Thread class in core.thread which has no requirements 
 for data to be immutable or shared. But you have to be even 
 more careful about how you access the data.
Ah... ok. But then, I will prefer to mark the appropriate parts as shared, I think...
Nov 26 2018
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 11/26/18 10:37 AM, Alex wrote:
 On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer wrote:
 Well, if you want to run calculations in another thread, then send the 
 result back to the original, you may be better off sending the state 
 needed for the calculation to the worker thread, and receiving the 
 result back via the messaging system.
How to do this, if parts of the state are statically saved in a type?
For instance, with your toy example, instead of saving the D[] as a static instance to share with all threads, use idup to make a complete copy, and then send that array directly to the new thread via spawn. When the result is done, instead of sending a bool to say it's complete, send the answer. Sending an immutable copy is the easiest way to ensure you have no races. It may be more expensive than you want to make a deep copy of something, but probably less expensive than the headache of creating a non-debuggable monster race condition.
 
 Note that if you want to do it without safety in place, you can use 
 the Thread class in core.thread which has no requirements for data to 
 be immutable or shared. But you have to be even more careful about how 
 you access the data.
Ah... ok. But then, I will prefer to mark the appropriate parts as shared, I think...
Right :) Threading is always very difficult to get right, and usually very difficult to find errors when you get it wrong. I remember working with pthreads about 20 years ago in a C++ project, and having a data race that caused a hang once every *2 weeks*. It took insane amounts of printouts and logging to figure out exactly why it happened (and the cycle was 2 weeks roughly), and the cause was (I think, not 100% sure) a place where a lock should have been but wasn't used. -Steve
Nov 26 2018
parent Alex <sascha.orlov gmail.com> writes:
On Monday, 26 November 2018 at 16:27:23 UTC, Steven Schveighoffer 
wrote:
 On 11/26/18 10:37 AM, Alex wrote:
 On Monday, 26 November 2018 at 15:26:43 UTC, Steven 
 Schveighoffer wrote:
 Well, if you want to run calculations in another thread, then 
 send the result back to the original, you may be better off 
 sending the state needed for the calculation to the worker 
 thread, and receiving the result back via the messaging 
 system.
How to do this, if parts of the state are statically saved in a type?
For instance, with your toy example, instead of saving the D[] as a static instance to share with all threads, use idup to make a complete copy, and then send that array directly to the new thread via spawn. When the result is done, instead of sending a bool to say it's complete, send the answer. Sending an immutable copy is the easiest way to ensure you have no races. It may be more expensive than you want to make a deep copy of something, but probably less expensive than the headache of creating a non-debuggable monster race condition.
Yeah... the problem is: the D[] array is stored statically not because of threads, but because every element of it has to have an access to it. So not to store it statically is the very point I want to avoid. But the idea is clear now, I think: I should delay the array expansion until the object is transferred to the other thread. Then, I expand the whole thing, (statically, as I would like to) do my calculations there and send back results, as you proposed. In this way, the object to copy will be a simple copy, because everything that would need a deep copy will be created in the proper thread, after the transfer. Thanks :)
Nov 26 2018