
digitalmars.D - Semantics of shared

reply Matt <gelfmrogen yahoo.com> writes:
[from reddit]

There was just a post to reddit announcing that thread local storage would be
the default for global variables and that the 'shared' qualifier would make
this happen.   What I can't find is a description of typing rules surrounding
'shared'.   From the discussion at reddit, it sounded like 'shared' was
intended to mean 'possibly shared', with the implication that thread local
objects can be treated as 'possibly shared'.  

The problem I see with this is that it implies that it is not safe to assign
one shared reference to another, because the former may actually be thread
local while the latter is actually global.  This would seem to make the "maybe
shared" concept pretty useless.  Is this not a problem?   Or if not, can
someone clarify to me what the actual semantics & typing rules are?

Thanks,
Matt
May 13 2009
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 13 May 2009 23:44:32 -0400, Matt <gelfmrogen yahoo.com> wrote:
 [from reddit]

 There was just a post to reddit announcing that thread local storage  
 would be the default for global variables and that the 'shared'  
 qualifier would make this happen.   What I can't find is a description  
 of typing rules surrounding 'shared'.   From the discussion at reddit,  
 it sounded like 'shared' was intended to mean 'possibly shared', with  
 the implication that thread local objects can be treated as 'possibly  
 shared'.
 The problem I see with this is that it implies that it is not safe to  
 assign one shared reference to another, because the former may actually  
 be thread local while the latter is actually global.  This would seem to  
 make the "maybe shared" concept pretty useless.  Is this not a  
 problem?   Or if not, can someone clarify to me what the actual  
 semantics & typing rules are?

 Thanks,
 Matt
I'm posting Walter's reply from reddit:

WalterBright: You're right about the return types of accessors, though we plan to address this. But if I may make some corrections, C++ has four versions (none, const, volatile, and const volatile), while D has five (none, const, immutable, shared, and shared const). Shared immutable is not counted since it is the same as immutable. I don't see a place for "maybe shared" that isn't already handled by simply "shared".
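For reference, the five combinations listed there can be spelled out in code. This is only a sketch: `Payload` and the alias names are made up for illustration, and it assumes a current D2 compiler.

```d
// Illustration only: `Payload` and the aliases are hypothetical names.
struct Payload { int x; }

alias Plain       = Payload;
alias Const       = const(Payload);
alias Immutable   = immutable(Payload);
alias Shared      = shared(Payload);
alias SharedConst = shared(const(Payload));

// Adding shared to an immutable type gives back the same immutable type,
// which is why it is not counted as a sixth variant.
static assert(is(shared(immutable(Payload)) == immutable(Payload)));

// shared and unshared versions are distinct types.
static assert(!is(Shared == Plain));
static assert(!is(SharedConst == Const));

void main() {}
```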
May 13 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Jacques wrote:
 I don't see a place for "maybe shared" that isn't already handled by 
 simply "shared".
I gave a flip and incomplete answer there. I'm not sure there is even a point to a function that could handle both shared and unshared with the same code. First of all, sharing is going to need some sort of synchronization; you're going to try and minimize the amount of code that has to deal with shared. I can't see trying to run an in-place sort on a shared array, for example. Can you imagine two threads trying to sort the same array? You're going to want to approach manipulating shared data differently than unshared.
May 13 2009
next sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 14 May 2009 01:27:15 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Robert Jacques wrote:
 I don't see a place for "maybe shared" that isn't already handled by  
 simply "shared".
I gave a flip and incomplete answer there. I'm not sure there is even a point to a function that could handle both shared and unshared with the same code. First of all, sharing is going to need some sort of synchronization; you're going to try and minimize the amount of code that has to deal with shared. I can't see trying to run an in-place sort on a shared array, for example. Can you imagine two threads trying to sort the same array? You're going to want to approach manipulating shared data differently than unshared.
I agree for POD, but what about classes where the synchronization is encapsulated behind a virtual function call? Also, does this mean 'scope' as a type is going away?
May 13 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
 Also, does this mean 'scope' as a type is going away?
Scope never was a type, it's a storage class.
May 13 2009
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 14 May 2009 02:13:37 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
I agree, but that doesn't seem to answer my question. Put another way, if I have an interface I which is implemented by both a thread local class L and a shared class S, then does some function F need to know whether the implementor of I is S or L? P.S. There will obviously be some interfaces S can't implement, but that's a separate issue.
 Also, does this mean 'scope' as a type is going away?
Scope never was a type, it's a storage class.
Sorry for the confusion of terminology. However, you blogged about using the 'scope' keyword to support escape analysis, etc., i.e. 'scope' would become the 'const' of the shared-thread local-stack storage type system. Is this still the plan?
May 14 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Robert Jacques wrote:
 On Thu, 14 May 2009 02:13:37 -0400, Walter Bright 
 <newshound1 digitalmars.com> wrote:
 
 Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
I agree, but that doesn't seem to answer my question. Put another way, if I have an interface I which is implemented by both a thread local class L and a shared class S, then does some function F need to know whether the implementor of I is S or L?
Since a reference to thread local cannot be implicitly cast to shared, then this scenario cannot happen - i.e. a shared function is not covariant with an unshared one.
 P.S. There will obviously be some interfaces S can't implement, but that's
 a separate issue.
 
 Also, does this mean 'scope' as a type is going away?
Scope never was a type, it's a storage class.
Sorry for the confusion of terminology. However, you blogged about using the 'scope' keyword to support escape analysis, etc., i.e. 'scope' would become the 'const' of the shared-thread local-stack storage type system. Is this still the plan?
I'm not sure what you mean by that.
May 14 2009
parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 14 May 2009 13:12:46 -0400, Walter Bright  
<newshound1 digitalmars.com> wrote:

 Robert Jacques wrote:
 On Thu, 14 May 2009 02:13:37 -0400, Walter Bright  
 <newshound1 digitalmars.com> wrote:

 Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
I agree, but that doesn't seem to answer my question. Put another way, if I have an interface I which is implemented by both a thread local class L and a shared class S, then does some function F need to know whether the implementor of I is S or L?
Since a reference to thread local cannot be implicitly cast to shared, then this scenario cannot happen - i.e. a shared function is not covariant with an unshared one.
 P.S. There will obviously be some interfaces S can't implement, but
 that's a separate issue.

 Also, does this mean 'scope' as a type is going away?
Scope never was a type, it's a storage class.
Sorry for the confusion of terminology. However, you blogged about using the 'scope' keyword to support escape analysis, etc., i.e. 'scope' would become the 'const' of the shared-thread local-stack storage type system. Is this still the plan?
I'm not sure what you mean by that.
I'm asking about the use of scope you blogged about: http://dobbscodetalk.com/index.php?option=com_myblog&show=Escape-Analysis.html&Itemid=29 [...] For D, we are looking at a design that creates a parameter storage class called scope: T foo(scope int* p); The presence of scope means that the function will not allow the parameter, or anything reachable through that parameter, to escape from the scope of the function. The scope storage class can be applied to the parameters or the 'this' reference (for member functions). Initially, this will be a promise by the implementor of foo(), but it should be entirely possible for the compiler to perform escape analysis using data flow analysis techniques on the implementation of foo() to ensure it. The caller of the function will know that a reference to a local variable can be safely passed as a scope parameter. A million line program can be automatically verified as being free of escaping reference bugs.
May 14 2009
parent Walter Bright <newshound1 digitalmars.com> writes:
Robert Jacques wrote:
 Sorry for the confusion of terminology. However, you blogged about
 using the 'scope' keyword to support escape analysis, etc., i.e.
 'scope' would become the 'const' of the shared-thread local-stack
 storage type system. Is this still the plan?
I'm not sure what you mean by that.
I'm asking about the use of scope you blogged about: http://dobbscodetalk.com/index.php?option=com_myblog&show=Escape-Analysis.html&Itemid=29 [...] For D, we are looking at a design that creates a parameter storage class called scope: T foo(scope int* p); The presence of scope means that the function will not allow the parameter, or anything reachable through that parameter, to escape from the scope of the function. The scope storage class can be applied to the parameters or the 'this' reference (for member functions). Initially, this will be a promise by the implementor of foo(), but it should be entirely possible for the compiler to perform escape analysis using data flow analysis techniques on the implementation of foo() to ensure it. The caller of the function will know that a reference to a local variable can be safely passed as a scope parameter. A million line program can be automatically verified as being free of escaping reference bugs.
We've talked about it, but I am unsure as to whether it will work or not.
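For readers following the quoted proposal, here is roughly how the `T foo(scope int* p)` idea reads in practice. It is a sketch only: `saved` and `inspect` are hypothetical names, and whether the promise is checked by the compiler or merely trusted is exactly the open question in this exchange.

```d
// Sketch of the quoted proposal; `saved` and `inspect` are hypothetical names.
int* saved;   // thread-local global that could outlive a caller's stack frame

// `scope` promises that p, and anything reachable through it,
// does not escape this function.
int inspect(scope int* p)
{
    // saved = p;   // would break the promise: the pointer escapes the function
    return *p;      // reading through the pointer is fine
}

void main()
{
    int local = 42;
    // The caller passes the address of a local, relying on the scope promise.
    assert(inspect(&local) == 42);
}
```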
May 14 2009
prev sibling parent reply Matt <gelfmrogen yahoo.com> writes:
Walter Bright Wrote:

 Robert Jacques wrote:
 I don't see a place for "maybe shared" that isn't already handled by 
 simply "shared".
I gave a flip and incomplete answer there. I'm not sure there is even a point to a function that could handle both shared and unshared with the same code. [...] You're going to want to approach manipulating shared data differently than unshared.
Ok, so there is no implicit cast from 'shared' to 'not shared' or vice-versa, so it's sound. Sorry, quotes like the above from Robert confused me. But now I'm confused by the idea that you wouldn't want to use the same code on shared and unshared data. The usual approach in C or C++ in dealing with shared data is to first acquire a lock and then to run code that would have been otherwise safe on the data. Is there some way to cast shared to thread local when a lock has been acquired?
  Can you imagine two threads trying to sort the same array?
Not at the same time, but yes.
May 14 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Matt wrote:
 But now I'm confused by the idea that you wouldn't want to use the
 same code on shared and unshared data.  The usual approach in C or
 C++ in dealing with shared data is to first acquire a lock and then
 to run code that would have been otherwise safe on the data.  Is
 there some way to cast shared to thread local when a lock has been
 acquired?
Shared data becomes unshared for the duration of a lock on it. The problem with this is:

1. determining that there are no other shared references into that data.

2. determining that the code operating on that data doesn't squirrel away a thread local reference to it.

Currently, Bartosz is working on these problems. There is no solution yet other than using an (unsafe) cast and relying on the user not to screw it up.
 Can you imagine two threads trying to sort the same array?
Not at the same time, but yes.
That's why there's no way one would do this with simply shared data. Locks would be needed, too.
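A minimal sketch of the lock-then-cast approach described above, for the sorting case. The function name `sortLocked` and the pairing of one `Mutex` with one array are assumptions made for illustration; nothing in the type system ties the lock to the data, so the cast stays an unchecked promise by the programmer.

```d
import core.sync.mutex : Mutex;
import std.algorithm.sorting : sort;

// Hypothetical helper: sorts a shared array in place while holding `lock`.
// Assumes every other thread touching `data` takes the same lock.
void sortLocked(shared(int)[] data, shared Mutex lock)
{
    lock.lock();
    scope (exit) lock.unlock();

    // Unchecked cast: treat the data as thread local while the lock is held.
    // The compiler does not verify either condition from the list above.
    int[] unshared = cast(int[]) data;
    sort(unshared);
    // `unshared` must not be stored anywhere that outlives the lock.
}

void main()
{
    auto lock = new shared Mutex();
    int[] raw = [3, 1, 2];
    // Casting to shared is likewise unchecked; `raw` is not used concurrently here.
    auto data = cast(shared(int)[]) raw;
    sortLocked(data, lock);
    assert(raw == [1, 2, 3]);
}
```

The two casts mark the places a reviewer has to check by hand: one where the data becomes visible to other threads, one where it is temporarily treated as exclusively owned.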
May 14 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

 Matt wrote:
 
 Shared data becomes unshared for the duration of a lock on it. 
Is that a statement of fact? Or is it just speculation leading to the issues below? Even if this was changed to "scope unshared", that still is really hairy since scope is a storage class.
 The problem with this is:
 
 1. determining that there are no other shared references into that data.
 
 2. determining that the code operating on that data doesn't squirrel 
 away a thread local reference to it.
 
 Currently, Bartosz is working on these problems. There is no solution 
 yet other than using an (unsafe) cast and relying on the user not to 
 screw it up.
His last blog post implied he was further along. I thought the recent shift to use TLS and shared inside dmd was because a design had been worked out. Without that, D hardly helps writing correct multi-threaded code :( It may be that I'm being overly pessimistic...
 Can you imagine two threads trying to sort the same array?
Not at the same time, but yes.
That's why there's no way one would do this with simply shared data. Locks would be needed, too.
May 14 2009
parent Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
 His last blog implied he was further. I thought the recent shift to
 use TLS and shared inside dmd was because a design had been worked
 out. Without that, D hardly helps writing correct multi-threaded code
 :( It may be that I'm being overly pessimistic...
Even if D goes *no further* with shared than where it is now, it still helps a *lot* in writing correct multi-threaded code. It does this by making the points where threads communicate obvious, instead of inadvertent and hidden.
May 14 2009
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Matt wrote:
 There was just a post to reddit announcing that thread local storage
 would be the default for global variables and that the 'shared'
 qualifier would make this happen.   What I can't find is a
 description of typing rules surrounding 'shared'.   From the
 discussion at reddit, it sounded like 'shared' was intended to mean
 'possibly shared', with the implication that thread local objects can
 be treated as 'possibly shared'.
 
 The problem I see with this is that it implies that it is not safe to
 assign one shared reference to another, because the former may
 actually be thread local while the latter is actually global.  This
 would seem to make the "maybe shared" concept pretty useless.  Is
 this not a problem?   Or if not, can someone clarify to me what the
 actual semantics & typing rules are?
The addresses of thread local data cannot be implicitly cast to shared. If they could, then they could be accessed by other threads, and the whole safety of them being thread local is compromised. Shared data must be created as shared, or cast to be shared. Casting to shared is an implicitly unsafe operation, relying on the user ensuring that it is safe to do so. Thread local data can point to shared data, but cannot be implicitly cast to shared.
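These rules can be checked at compile time. The following is a sketch, assuming a current D2 compiler; `counter`, `total`, `watcher`, and `publish` are made-up names.

```d
// Sketch only; all names here are hypothetical.
int counter;            // thread local by default: one copy per thread
shared int total;       // created as shared: one copy visible to all threads
shared(int)* watcher;   // thread local pointer to shared data: allowed

// Taking an address keeps the qualifier of the data.
static assert(is(typeof(&counter) == int*));
static assert(is(typeof(&total) == shared(int)*));

// No implicit conversion in either direction.
static assert(!is(int* : shared(int)*));
static assert(!is(shared(int)* : int*));

void publish(shared(int)* p) {}

void main()
{
    watcher = &total;                       // fine: the data was created as shared
    publish(watcher);

    // publish(&counter);                   // rejected: int* is not shared(int)*
    publish(cast(shared(int)*) &counter);   // compiles, but the cast is unchecked:
                                            // the programmer vouches that it is safe
}
```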
May 13 2009
prev sibling next sibling parent reply Matt <gelfmrogen yahoo.com> writes:
Matt Wrote:

 Is there some way to cast shared to thread local when a lock has been
acquired?
It occurs to me that your plan is probably: with an explicit cast when the lock is acquired. So in practice 'unshared' is really going to mean something more like 'exclusively owned' and these modifiers aren't really going to help with managing what data is actually thread local vs. in the global heap. The purpose of the 'shared' annotation is then just to warn you about unlocked data and to serve a similar purpose as 'volatile' for code generation. Is this about right?
May 14 2009
parent reply Jason House <jason.james.house gmail.com> writes:
Matt Wrote:

 Matt Wrote:
 
 Is there some way to cast shared to thread local when a lock has been
acquired?
It occurs to me that your plan is probably: with an explicit cast when the lock is acquired. So in practice 'unshared' is really going to mean something more like 'exclusively owned' and these modifiers aren't really going to help with managing what data is actually thread local vs. in the global heap. The purpose of the 'shared' annotation is then just to warn you about unlocked data and to serve a similar purpose as 'volatile' for code generation. Is this about right?
I don't think so. What happens when the lock is released? Any residual use of the cast data is going to be incorrect (lacks a lock). What that means is that the cast is unsafe even when the data is locked! Casting is a back door that should be used with extreme care.
May 14 2009
parent reply Matt <gelfmrogen yahoo.com> writes:
Jason House Wrote:

 Matt Wrote:
 Is this about right?
I don't think so. What happens when the lock is released? Any residual use of the cast data is going to be incorrect (lacks a lock). What that means is that the cast is unsafe even when the data is locked! Casting is a back door that should be used with extreme care.
You're right - it's a big departure from C++ where casting is to be avoided if possible, and you're right that the cast isn't safe in general. It's semi-reasonable because it's safe unless you cast somewhere and the C++ multi-threaded programming model is unsafe anyway. But it's bad that the bugs introduced can be quite far from the casts (e.g. if cast-to-unshared data is placed in a global thread-local somewhere). But I don't see how you can do much of anything useful with shared data if something like this isn't the plan. Tracking what data is protected by lock and when is going to be outside of the scope of any type system you're going to want for D (and is going to be undecidable in general).
May 14 2009
next sibling parent Jason House <jason.james.house gmail.com> writes:
Matt Wrote:

 Jason House Wrote:
 
 Matt Wrote:
 Is this about right?
I don't think so. What happens when the lock is released? Any residual use of the cast data is going to be incorrect (lacks a lock). What that means is that the cast is unsafe even when the data is locked! Casting is a back door that should be used with extreme care.
You're right - it's a big departure from C++ where casting is to be avoided if possible, and you're right that the cast isn't safe in general. It's semi-reasonable because it's safe unless you cast somewhere and the C++ multi-threaded programming model is unsafe anyway. But it's bad that the bugs introduced can be quite far from the casts (e.g. if cast-to-unshared data is placed in a global thread-local somewhere). But I don't see how you can do much of anything useful with shared data if something like this isn't the plan. Tracking what data is protected by lock and when is going to be outside of the scope of any type system you're going to want for D (and is going to be undecidable in general).
We're all still waiting to hear how D will handle locking under the covers. I don't think locking will be used to change types. A shared variable will always be shared, even if you have the lock.
May 14 2009
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Matt wrote:
 You're right - it's a big departure from C++ where casting is to be
 avoided if possible, and you're right that the cast isn't safe in
 general.  It's semi-reasonable because it's safe unless you cast
 somewhere and the C++ multi-threaded programming model is unsafe
 anyway.  But it's bad that the bugs introduced can be quite far from
 the casts (e.g. if cast-to-unshared data is placed in a global
 thread-local somewhere).
It's not all bad. The huge advantage with "shared" is when you do a code review, it points you to where the potential trouble spots are. With the implicit sharing in C/C++, the whole program is a trouble spot.
May 14 2009
prev sibling parent reply Jason House <jason.james.house gmail.com> writes:
Robert Jacques Wrote:

 On Thu, 14 May 2009 02:13:37 -0400, Walter Bright  
 <newshound1 digitalmars.com> wrote:
 
 Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
I agree, but that doesn't seem to answer my question. Put another way, if I have an interface I which is implemented by both a thread local class L and a shared class S, then does some function F need to know whether the implementor of I is S or L?
Shared data needs fundamentally different handling than thread local data. I expect "shared I" and "__thread I" to be handled differently. You can't store an S where an L is expected... It can break code.
 P.S. There will obviously be some interfaces S can't implement, but that's a
 separate issue.
 
 Also, does this mean 'scope' as a type is going away?
Of course not. Scope storage class will remain.
 Scope never was a type, it's a storage class.
Sorry for the confusion of terminology. However, you blogged about using the 'scope' keyword to support escape analysis, etc., i.e. 'scope' would become the 'const' of the shared-thread local-stack storage type system. Is this still the plan?
May 14 2009
parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 14 May 2009 08:51:37 -0400, Jason House  
<jason.james.house gmail.com> wrote:

 Robert Jacques Wrote:

 On Thu, 14 May 2009 02:13:37 -0400, Walter Bright
 <newshound1 digitalmars.com> wrote:

 Robert Jacques wrote:
 I agree for POD, but what about classes where the synchronization is
 encapsulated behind a virtual function call?
synchronization can make a shared reference "tail shared".
I agree, but that doesn't seem to answer my question. Put another way, if I have an interface I which is implemented by both a thread local class L and a shared class S, then does some function F need to know whether the implementor of I is S or L?
Shared data needs fundamentally different handling than thread local data. I expect "shared I" and "__thread I" to be handled differently. You can't store an S where an L is expected... It can break code.
 P.S. There will obviously be some interfaces S can't implement, but
 that's a separate issue.

 Also, does this mean 'scope' as a type is going away?
Of course not. Scope storage class will remain.
The use of scope I'm talking about (see below) isn't even implemented yet, so how can it remain? Walter blogged a while ago about using the scope keyword to aid escape analysis, which would provide a common type for shared-local-stack allocation. I'm not referring to the use of 'scope' to stack allocate a class.
 Scope never was a type, it's a storage class.
Sorry for the confusion of terminology. However, you blogged about using the 'scope' keyword to support escape analysis, etc., i.e. 'scope' would become the 'const' of the shared-thread local-stack storage type system. Is this still the plan?
May 14 2009