digitalmars.D.learn - handling shared objects

Alex <sascha.orlov gmail.com> writes:

Hi all!
Can somebody explain to me why the example below does not work
the way I expect it to?

My example is a little longer this time, but half of it is taken
from
https://dlang.org/library/std/concurrency/receive_only.html

```d
import std.concurrency : receive, receiveOnly, send, spawn, thisTid, Tid;
import std.stdio : writeln;

struct D
{
	size_t d;
	static S s;
}

struct S
{
	D[] data;
}

struct Model
{
	auto ref s()
	{
		return D.s;
	}

	void run()
	{
		"I'm running".writeln;
		writeln(s.data.length);
	}
}

Model m;

void main()
{
	D.s.data.length = 4;
	m.run; // prints a length of 4

	auto childTid = spawn(&runner, thisTid);
	send(childTid, size_t(0));
	receiveOnly!bool;
}

void runner(Tid ownerTid)
{
	receive((size_t dummy) {
		m.run; // prints a length of 0 here
		// Send a message back to the owner thread
		// indicating success.
		send(ownerTid, true);
	});
}
```

The idea is:
the model is something that I can declare deliberately in the 
application. And, I assumed that if it is (globally) shared, then 
so are all compartments of it, even if they are not explicitly 
part of the model.

Some problems arose:
1. Obviously, this is not the case, as the output differs
depending on the thread in which I start the model function.
2. If I declare the model object inside main, the compiler
aborts with the message "Aliases to mutable thread-local data not
allowed."
3. If I mark the S instance as shared, it works. But I didn't
intend to do this... Is this really how it is meant to be?

As I'm writing the model object as well as all of its
compartments, I can do almost everything... but what I want to
avoid is declaring the instances of the compartments inside the model:
they are stored locally in their modules, and the single elements
of them have to know about the compound objects, as with the D
and S structs shown.

Nov 26 2018

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/26/18 9:00 AM, Alex wrote:
> Some problems arose:
> 1. Obviously, this is not the case, as the output differs
> depending on the thread in which I start the model function.

Yes, unless you declare the model to be shared, there is a copy made for 
each thread, independently managed.

> 2. If I declare the model object inside main, the compiler aborts
> with the message "Aliases to mutable thread-local data not allowed."

Right, because std.concurrency does not allow you to pass references to
unshared mutable data between threads.
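
A rough illustration of that rule, as a sketch rather than the exact check std.concurrency performs: `std.traits.hasUnsharedAliases` reports whether a type carries mutable, unshared references, which is the kind of data that triggers the "Aliases to mutable thread-local data not allowed." message. `Payload` is a hypothetical struct with one array member, introduced only for this illustration.

```d
import std.traits : hasUnsharedAliases;

// Hypothetical type with a mutable indirection (not from the example above).
struct Payload
{
    int[] data;
}

// Mutable, unshared references: not acceptable as spawn/send arguments.
static assert( hasUnsharedAliases!(Payload*));
static assert( hasUnsharedAliases!(int[]));

// Plain values, immutable data and shared data carry no unshared aliases.
static assert(!hasUnsharedAliases!(size_t));
static assert(!hasUnsharedAliases!(immutable(int)[]));
static assert(!hasUnsharedAliases!(shared(Payload)*));

void main() {}
```

The last three cases are the usual ways out: pass a value, pass an immutable copy, or mark the data shared.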

> 3. If I mark the S instance as shared, it works. But I didn't intend to
> do this... Is this really how it is meant to be?

Let's go over how the data is actually laid out:

Model has NO data in it, so it doesn't really matter if it's shared or not.

S has a single array of element type D's. There is no static data in S, 
so it has no static state (only instance state).

D has a single size_t, which is thread-local, but has a static instance 
of S in the TYPE. There is not a copy of an S for each D, just a single 
copy for each THREAD. If you make this shared, it's a shared copy for 
all threads. This means the array inside the shared S will be shared 
between all threads too.

What happens when you spawn a new thread is that the thread-local copy 
of m is created with Model.init (but it has no data, so it's not 
important). A thread-local copy of D.s is created with S.init (so an 
empty array). The reason your assignment of length in main doesn't work 
is because the init value is used, not the current value from the main 
thread.

So yes, you need to make it shared to have the sub-thread see the 
changes, if that's what you are after.
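
The difference can be seen in a minimal sketch. `Counter`, `worker` and `observe` are hypothetical names chosen for illustration, and plain ints stand in for the array from the example; the shared variable is accessed through `core.atomic` to stay on the safe side.

```d
import core.atomic : atomicLoad, atomicStore;
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.stdio : writeln;

struct Counter
{
    static int perThread;      // one copy per thread (the default)
    shared static int forAll;  // one copy for the whole process
}

void worker()
{
    // Report what this freshly spawned thread sees.
    send(ownerTid, Counter.perThread, atomicLoad(Counter.forAll));
}

auto observe()
{
    Counter.perThread = 4;            // only changes the caller's copy
    atomicStore(Counter.forAll, 4);   // visible to every thread
    spawn(&worker);
    return receiveOnly!(int, int);
}

void main()
{
    auto seen = observe();
    writeln("thread-local, as seen by worker: ", seen[0]); // 0
    writeln("shared, as seen by worker:       ", seen[1]); // 4
}
```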

> As I'm writing the model object as well as all of its compartments, I
> can do almost everything... but what I want to avoid is declaring the
> instances of the compartments inside the model:
> they are stored locally in their modules, and the single elements of them
> have to know about the compound objects, as with the D and S structs
> shown.

A static member is stored per thread. If you want a global that's shared
between all threads, you need to make it shared. But the result may not
be what you are looking for: shared can cause difficulty if your code
wasn't written to deal with it (and a lot of code isn't).
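
One example of that difficulty, as a hedged sketch: code that simply increments a variable has to be rewritten once the variable is shared, because read-modify-write operations on shared data must go through `core.atomic` (or a lock). The names `hits`, `bump` and `countFromTwoThreads` are hypothetical.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;
import core.thread : Thread;
import std.stdio : writeln;

shared int hits; // hypothetical process-wide counter

// A plain `hits++` is not accepted for a shared variable;
// the increment has to be an atomic operation.
void bump(int times)
{
    foreach (_; 0 .. times)
        atomicOp!"+="(hits, 1);
}

int countFromTwoThreads(int times)
{
    atomicStore(hits, 0);
    auto a = new Thread({ bump(times); });
    auto b = new Thread({ bump(times); });
    a.start();
    b.start();
    a.join();
    b.join();
    return atomicLoad(hits);
}

void main()
{
    writeln(countFromTwoThreads(1000)); // 2000
}
```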

-Steve

Nov 26 2018

Alex <sascha.orlov gmail.com> writes:

On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer 
wrote:
>> Some problems arose:
>> 1. Obviously, this is not the case, as the output differs
>> depending on the thread in which I start the model function.

> Yes, unless you declare the model to be shared, there is a copy
> made for each thread, independently managed.

>> 2. If I declare the model object inside main, the compiler
>> aborts with the message "Aliases to mutable thread-local data
>> not allowed."

> Right, because you are not allowed to pass unshared data
> between threads.

>> 3. If I mark the S instance as shared, it works. But I didn't
>> intend to do this... Is this really how it is meant to be?

> Let's go over how the data is actually laid out:

> Model has NO data in it, so it doesn't really matter if it's
> shared or not.

> S has a single array of element type D's. There is no static
> data in S, so it has no static state (only instance state).

> D has a single size_t, which is thread-local,

More precisely, the size_t is local to an instance of D.

> but has a static instance of S in the TYPE.

Right, because to work properly, a D instance has to know about
the D's in the array of S.

> There is not a copy of an S for each D, just a single copy for
> each THREAD. If you make this shared, it's a shared copy for
> all threads. This means the array inside the shared S will be
> shared between all threads too.

This is the point where my headaches begin: I do not need this 
level of sharedness, but I don't really care.

> What happens when you spawn a new thread is that the
> thread-local copy of m is created with Model.init (but it has
> no data, so it's not important). A thread-local copy of D.s is
> created with S.init (so an empty array). The reason your
> assignment of length in main doesn't work is because the init
> value is used, not the current value from the main thread.

> So yes, you need to make it shared to have the sub-thread see
> the changes, if that's what you are after.

>> As I'm writing the model object as well as all of its
>> compartments, I can do almost everything... but what I want to
>> avoid is declaring the instances of the compartments inside the
>> model:
>> they are stored locally in their modules, and the single
>> elements of them have to know about the compound objects, as
>> with the D and S structs shown.

> A static member is stored per thread. If you want a global
> that's shared between all threads, you need to make it shared.
> But the result may not be what you are looking for: shared can
> cause difficulty if your code wasn't written to deal with it
> (and a lot of code isn't).

Well, the only reason I use multithreading is this:
https://forum.dlang.org/thread/cfrtilrtbahollmazzfv forum.dlang.org

So even if my code is not really designed for shared, this doesn't
matter, as I wait for "the other" thread to end (or interrupt
it). So marking the model as shared is already a workaround for
being able to pass it to another thread, which I don't really
need. However, if all components of the model now also have to
be marked shared, the workaround has to grow and expand over all
components (?). This is the reason for this question...

> -Steve

Nov 26 2018

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/26/18 10:16 AM, Alex wrote:
> On Monday, 26 November 2018 at 14:28:33 UTC, Steven Schveighoffer wrote:

>> A static member is stored per thread. If you want a global that's
>> shared between all threads, you need to make it shared. But the result
>> may not be what you are looking for: shared can cause difficulty if
>> your code wasn't written to deal with it (and a lot of code isn't).

> Well, the only reason I use multithreading is this:
> https://forum.dlang.org/thread/cfrtilrtbahollmazzfv forum.dlang.org

> So even if my code is not really designed for shared, this doesn't matter,
> as I wait for "the other" thread to end (or interrupt it). So marking
> the model as shared is already a workaround for being able to pass it
> to another thread, which I don't really need. However, if all
> components of the model now also have to be marked shared, the workaround
> has to grow and expand over all components (?). This is the reason for
> this question...

Well, if you want to run calculations in another thread, then send the 
result back to the original, you may be better off sending the state 
needed for the calculation to the worker thread, and receiving the 
result back via the messaging system. It's really hard to know the 
requirements with such toy examples, so maybe that's not workable for you.

What it seems like you need is a way to run the calculations in a 
separate thread. But with multiple threads comes all the dangers of 
concurrency and races. So you have to be very careful about how you 
design this.

At this point, std.concurrency does not have the ability to safely pass 
mutable data to another thread without it being shared.

Note that if you want to do it without safety in place, you can use the 
Thread class in core.thread which has no requirements for data to be 
immutable or shared. But you have to be even more careful about how you 
access the data.
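
A minimal sketch of that unchecked route, assuming the caller simply waits for the worker with `join` (the function name `sumInThread` is hypothetical). Nothing stops the delegate from touching mutable thread-local data here, so all discipline is up to the programmer.

```d
import core.thread : Thread;
import std.stdio : writeln;

long sumInThread(int[] data)
{
    long sum;
    // The delegate captures `data` and `sum` directly. Neither is
    // immutable or shared, and core.thread does not check for that.
    auto t = new Thread({
        foreach (x; data)
            sum += x;
    });
    t.start();
    // Until join returns, the caller must not touch `data` or `sum`.
    t.join();
    return sum;
}

void main()
{
    writeln(sumInThread([1, 2, 3, 4])); // 10
}
```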

-Steve

Nov 26 2018

Alex <sascha.orlov gmail.com> writes:

On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer 
wrote:
> Well, if you want to run calculations in another thread, then
> send the result back to the original, you may be better off
> sending the state needed for the calculation to the worker
> thread, and receiving the result back via the messaging system.

How can I do this if parts of the state are stored statically in a
type?

> Note that if you want to do it without safety in place, you can
> use the Thread class in core.thread which has no requirements
> for data to be immutable or shared. But you have to be even
> more careful about how you access the data.

Ah... ok. But then I would prefer to mark the appropriate parts
as shared, I think...

Nov 26 2018

Steven Schveighoffer <schveiguy gmail.com> writes:

On 11/26/18 10:37 AM, Alex wrote:
> On Monday, 26 November 2018 at 15:26:43 UTC, Steven Schveighoffer wrote:
>> Well, if you want to run calculations in another thread, then send the
>> result back to the original, you may be better off sending the state
>> needed for the calculation to the worker thread, and receiving the
>> result back via the messaging system.

> How can I do this if parts of the state are stored statically in a type?

For instance, with your toy example, instead of saving the D[] as a 
static instance to share with all threads, use idup to make a complete 
copy, and then send that array directly to the new thread via spawn. 
When the result is done, instead of sending a bool to say it's complete, 
send the answer.

Sending an immutable copy is the easiest way to ensure you have no 
races. It may be more expensive than you want to make a deep copy of 
something, but probably less expensive than the headache of creating a 
non-debuggable monster race condition.
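
A minimal sketch of this approach, under the assumption that the calculation is just a sum over the elements (the thread does not say what the real calculation is). `worker` and `compute` are hypothetical names, and D is reduced to its single data member.

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.stdio : writeln;

struct D
{
    size_t d;
}

// The worker only ever sees an immutable copy, so no races are possible.
void worker(immutable(D)[] data)
{
    size_t total;
    foreach (e; data)
        total += e.d;
    // Send the answer itself instead of a "done" flag.
    send(ownerTid, total);
}

size_t compute(D[] data)
{
    // idup makes a complete immutable copy that may cross threads.
    spawn(&worker, data.idup);
    return receiveOnly!size_t;
}

void main()
{
    auto data = [D(1), D(2), D(3), D(4)];
    writeln(compute(data)); // 10
}
```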

 
>> Note that if you want to do it without safety in place, you can use
>> the Thread class in core.thread which has no requirements for data to
>> be immutable or shared. But you have to be even more careful about how
>> you access the data.

> Ah... ok. But then I would prefer to mark the appropriate parts as
> shared, I think...

Right :)

Threading is always very difficult to get right, and usually very 
difficult to find errors when you get it wrong. I remember working with 
pthreads about 20 years ago in a C++ project, and having a data race 
that caused a hang once every *2 weeks*. It took insane amounts of 
printouts and logging to figure out exactly why it happened (and the 
cycle was 2 weeks roughly), and the cause was (I think, not 100% sure) a 
place where a lock should have been but wasn't used.

-Steve

Nov 26 2018

Alex <sascha.orlov gmail.com> writes:

On Monday, 26 November 2018 at 16:27:23 UTC, Steven Schveighoffer 
wrote:
> On 11/26/18 10:37 AM, Alex wrote:
>> On Monday, 26 November 2018 at 15:26:43 UTC, Steven
>> Schveighoffer wrote:
>>> Well, if you want to run calculations in another thread, then
>>> send the result back to the original, you may be better off
>>> sending the state needed for the calculation to the worker
>>> thread, and receiving the result back via the messaging
>>> system.

>> How can I do this if parts of the state are stored statically in
>> a type?

> For instance, with your toy example, instead of saving the D[]
> as a static instance to share with all threads, use idup to
> make a complete copy, and then send that array directly to the
> new thread via spawn. When the result is done, instead of
> sending a bool to say it's complete, send the answer.

> Sending an immutable copy is the easiest way to ensure you have
> no races. It may be more expensive than you want to make a deep
> copy of something, but probably less expensive than the
> headache of creating a non-debuggable monster race condition.

Yeah... the problem is:
the D[] array is stored statically not because of threads, but
because every element of it needs access to it. So not storing
it statically is exactly what I want to avoid.

But the idea is clear now, I think: I should delay the array
expansion until the object has been transferred to the other thread.
Then I expand the whole thing (statically, as I would like to),
do my calculations there and send back the results, as you proposed.
In this way, the object to transfer will be a simple copy, because
everything that would need a deep copy will be created in the
proper thread, after the transfer.
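
A sketch of how this plan might look, assuming the only thing that needs to be transferred is the desired array length and that the calculation is a simple sum (both are assumptions made for illustration; `worker` and `runModel` are hypothetical names).

```d
import std.concurrency : ownerTid, receiveOnly, send, spawn;
import std.stdio : writeln;

struct S
{
    D[] data;
}

struct D
{
    size_t d;
    static S s; // thread-local: every thread starts with an empty one
}

void worker(size_t count)
{
    // Expand the static array here, in the thread that will use it.
    D.s.data.length = count;
    foreach (i, ref e; D.s.data)
        e.d = i;

    size_t total;
    foreach (e; D.s.data)
        total += e.d;
    send(ownerTid, total);
}

size_t runModel(size_t count)
{
    // Only a plain value crosses the thread boundary.
    spawn(&worker, count);
    return receiveOnly!size_t;
}

void main()
{
    writeln(runModel(4));       // 6
    writeln(D.s.data.length);   // 0, the caller's copy was never expanded
}
```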

Thanks :)

Nov 26 2018
