
digitalmars.D - Something needs to happen with shared, and soon.

reply Alex Rønne Petersen <alex lycus.org> writes:
Hi,

It's starting to get outright embarrassing to talk to newcomers about 
D's concurrency support because the most fundamental part of it -- the 
shared type qualifier -- does not have well-defined semantics at all.

I'm certainly not alone in being annoyed by this state of affairs: 
http://d.puremagic.com/issues/show_bug.cgi?id=8993

I've posted rants about the state of shared before and, from the 
comments on those, it appears that what most people want shared to do is 
at least one (and usually multiple) of

* make variables global (if appropriate in the context);
* make the wrapped type completely separate from the unwrapped type;
* make all operations be atomic;
* make all operations result in memory barriers.

At a glance, this looks fine. Exactly what you would want for shared 
types in a concurrent setting, right?

Except, not really. I'll try to explain all of the unsolved problems 
with shared below...

First of all, the fact that shared(T) is completely separate from T 
(i.e. no conversions allowed, except for primitive types) is a huge 
usability problem. In practice, it means that 99% of the standard 
library is unusable with shared types. Hell, even most of the runtime 
doesn't work with shared types. I don't know how to best solve this 
particular problem; I'm just pointing it out because anyone who tries to 
do anything non-trivial with shared will invariably run into this.
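
To illustrate with a minimal sketch (Point and scale are invented names,
not from any real library):

struct Point { int x, y; }

void scale(ref Point p, int factor)
{
    p.x *= factor;
    p.y *= factor;
}

void main()
{
    shared Point p = Point(1, 2);
    // scale(p, 3);             // error: cannot pass shared(Point) to ref Point
    scale(*cast(Point*) &p, 3); // the ubiquitous workaround: cast shared away
}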

Second, the idea of making shared insert atomic operations is an 
absolute fallacy. It only makes sense for primitive types for the most 
part, and even for those, what sizes are supported depends on the target 
architecture. A number of ideas have come up to solve this problem:

* We make shared(T) not compile for certain Ts depending on the target 
architecture. I personally think this is a terrible idea because most 
code using shared will not be portable at all.
* We require any architecture D targets to support atomic operations for 
a certain size S at the very least. This is fine for primitives up to 64 
bits in size, but doesn't clear up the situation for larger types (real, 
complex types, cent/ucent, ...).
* We make shared not insert atomic operations at all (thus making it 
kind of useless for anything but documentation).
* (Possibly others I have forgotten; please let me know if this is the 
case.)

I don't think any of these are particularly attractive, to be honest. If 
we do make shared insert atomic operations, we would also have to 
consider the memory ordering of those operations.
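
For concreteness, the explicit route druntime offers today is
core.atomic; a hedged sketch of what any of these schemes would
presumably lower to:

import core.atomic;

shared int counter;

void bump()
{
    atomicOp!"+="(counter, 1);          // atomic read-modify-write
    int snapshot = atomicLoad(counter); // atomic read
    atomicStore(counter, snapshot);     // atomic write (the pair is not
                                        // atomic as a whole, unlike atomicOp)
}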

Third, we have memory barriers. I strongly suspect that this is a 
misnomer in most cases where people have suggested this; it's generally 
not useful to have a compiler insert barriers because they are used to 
control ordering of load/store operations which is something the 
programmer will want to do explicitly. In any case, the compiler can't 
usefully figure out where to put barriers, so it would just result in 
really bad performance for no apparent gain.

Fourth, there is implementation complexity. If shared is meant to insert 
specialized instructions, it will result in effectively two code paths 
for most code generation in any D compiler (read: maintenance nightmare).

Fifth, it is completely unclear whether casting to and from shared is 
legal (but with a big fat "caution" sign like casting away const) or if 
it's undefined behavior. Making it undefined behavior would further 
increase the usability problem I described above.

And finally, the worst part of all of this? People writing code that 
uses shared today are blindly assuming it actually does the right thing. 
It doesn't. Their code will break on any non-x86 platform. This is an 
absolutely horrifying situation now that ARM, MIPS, and PowerPC are 
starting to become viable targets for D.

Something needs to be done about shared. I don't know what, but the 
current situation is -- and I'm really not exaggerating here -- 
laughable. I think we either need to just make it perfectly clear that 
shared is for documentation purposes and nothing else, or, figure out an 
alternative system to shared, because I don't see shared actually being 
useful for real world work no matter what we do with it.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Nov 11 2012
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:
 And finally, the worst part of all of this? People writing code that
 uses shared today are blindly assuming it actually does the right thing.
 It doesn't.
I think most people probably don't even use shared due to lacking
Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036

Not even using the write functions worked on shared types until 2.059
(e.g. printing shared arrays).

'shared' has this wonderfully attractive name to it, but apparently it
doesn't have many guarantees? E.g. Walter's comment here:
http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1

So +1 from me just because I have no idea what shared is supposed to
guarantee. I've just stubbornly used __gshared variables because
std.concurrency.send() doesn't accept mutable data. send() doesn't work
with shared either, so I have no clue.. :)
Nov 11 2012
parent "Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
On Sunday, 11 November 2012 at 19:28:30 UTC, Andrej Mitrovic 
wrote:
 On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:
 And finally, the worst part of all of this? People writing 
 code that
 uses shared today are blindly assuming it actually does the 
 right thing.
 It doesn't.
I think most people probably don't even use shared due to lacking
Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036

Not even using the write functions worked on shared types until 2.059
(e.g. printing shared arrays).

'shared' has this wonderfully attractive name to it, but apparently it
doesn't have many guarantees? E.g. Walter's comment here:
http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1

So +1 from me just because I have no idea what shared is supposed to
guarantee. I've just stubbornly used __gshared variables because
std.concurrency.send() doesn't accept mutable data. send() doesn't work
with shared either, so I have no clue.. :)
Fix support for shared(T) in std.variant, and you will have fixed
send() as well. Meanwhile, in common cases a simple wrapper struct
suffices.

module toy;

import std.concurrency, std.stdio;

struct SImpl
{
    string s;
    int    i;
}

alias shared( SImpl ) S;

struct Msg  { S s; }
struct Quit {}

S global = S( "global", 999 );

void main ()
{
    auto child = spawn( &task );

    S s = S( "abc", 42 );
    child.send( Msg( s ) );
    child.send( Msg( global ) );
    child.send( Quit() );
}

void task ()
{
    bool sentinel = true;
    while ( sentinel )
    {
        receive(
            ( Msg  msg ) { writeln( msg.s.s, " -- ", msg.s.i ); },
            ( Quit msg ) { sentinel = false;                    }
        );
    }
}

grant aesgard ~/Projects/D/foo/shared_test $ dmd toy && ./toy
abc -- 42
global -- 999

-- Chris Nicholson-Sauls
Nov 11 2012
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Fully agree.

Kind Regards
Benjamin Thaut
Nov 11 2012
parent reply "martin" <kinke libero.it> writes:
On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
Nov 11 2012
parent reply Graham St Jack <Graham.StJack internode.on.net> writes:
On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:

 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
+1. I find it so broken that I have to avoid using it in all but the most trivial situations.
Nov 11 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 11/11/2012 23:36, Graham St Jack wrote:
 On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:

 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
+1. I find it so broken that I have to avoid using it in all but the most trivial situations.
That isn't a bad thing in itself.
Nov 11 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, November 12, 2012 01:17:06 deadalnix wrote:
 On 11/11/2012 23:36, Graham St Jack wrote:
 On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:
 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
 +1
 +1.

 I find it so broken that I have to avoid using it in all but the most
 trivial situations.
 That isn't a bad thing in itself.

I don't think that it's really intended that shared be 100% easy to use.
You're _supposed_ to use it sparingly. But at this point, it borders on
being utterly unusable.

We have a bit of a problem with the basic idea though in that you're not
supposed to be using shared much, and it's supposed to be segregated such
that having the shared equivalent of const (as in it works with both
shared and non-shared) would pose a big problem (it's also probably
untenable with memory barriers and the like), but if you _don't_ have
something like that, you either can't use shared with much of anything,
or you have to cast it away all over the place, which loses all of the
memory barriers or whatnot. We have conflicting requirements which aren't
being managed very well.

I don't know how protected shared really needs to be though. Anything
involving shared should make heavy use of mutexes and synchronized and
whatnot, meaning that at least some of the protections that people want
with shared are useless unless you're writing code which is being stupid
and not using mutexes or whatnot. So, casting away shared might not
actually be that big a deal so long as it's temporary to call a function
(as opposed to stashing the variable away somewhere) and that call is
protected by a mutex or other thread-protection mechanism.

At the moment, I think that the only way to make stuff work with both
shared and unshared (aside from using lots of casts) is to make use of
templates, and since most of druntime and Phobos isn't tested with
shared, things like Unqual probably screw with that pretty thoroughly.
It's at least conceivable though that stuff like std.algorithm could work
with shared just fine.

I don't think that there's much question though that shared is the major
chink in our armor with regards to thread-local by default. The basic
idea is great, but the details still need some work.

- Jonathan M Davis
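
To make that template point concrete, a minimal sketch (sum is a
made-up example; nothing here is atomic, the loop just copies
primitives out of the shared array):

int sum(E)(E[] arr)
{
    int total = 0;
    foreach (e; arr)   // each element is copied into a non-shared local
        total += e;    // plain, non-atomic reads
    return total;
}

void main()
{
    int[] a = [1, 2, 3];
    shared(int)[] b = [4, 5, 6];
    assert(sum(a) == 6);
    assert(sum(b) == 15); // instantiated as sum!(shared(int))
}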
Nov 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Alex Rønne Petersen:

 Something needs to be done about shared. I don't know what,
Maybe deprecate it and introduce something else that is rather
different and based on thought-out theory?

Bye,
bearophile
Nov 11 2012
prev sibling next sibling parent "nixda" <nd o.de> writes:
drop it in favour of :
http://forum.dlang.org/post/k7j1ta$2kv8$1 digitalmars.com


On Sunday, 11 November 2012 at 18:46:12 UTC, Alex Rønne Petersen 
wrote:
 Hi,

 It's starting to get outright embarrassing to talk to newcomers 
 about D's concurrency support because the most fundamental part 
 of it -- the shared type qualifier -- does not have 
 well-defined semantics at all.

 I'm certainly not alone in being annoyed by this state of 
 affairs: http://d.puremagic.com/issues/show_bug.cgi?id=8993

 I've posted rants about the state of shared before and, from 
 the comments on those, it appears that what most people want 
 shared to do is at least one (and usually multiple) of

 * make variables global (if appropriate in the context);
 * make the wrapped type completely separate from the unwrapped 
 type;
 * make all operations be atomic;
 * make all operations result in memory barriers.

 At a glance, this looks fine. Exactly what you would want for 
 shared types in a concurrent setting, right?

 Except, not really. I'll try to explain all of the unsolved 
 problems with shared below...

 First of all, the fact that shared(T) is completely separate 
 from T (i.e. no conversions allowed, except for primitive 
 types) is a huge usability problem. In practice, it means that 
 99% of the standard library is unusable with shared types. 
 Hell, even most of the runtime doesn't work with shared types. 
 I don't know how to best solve this particular problem; I'm 
 just pointing it out because anyone who tries to do anything 
 non-trivial with shared will invariably run into this.

 Second, the idea of making shared insert atomic operations is 
 an absolute fallacy. It only makes sense for primitive types 
 for the most part, and even for those, what sizes are supported 
 depends on the target architecture. A number of ideas have come 
 up to solve this problem:

 * We make shared(T) not compile for certain Ts depending on the 
 target architecture. I personally think this is a terrible idea 
 because most code using shared will not be portable at all.
 * We require any architecture D targets to support atomic 
 operations for a certain size S at the very least. This is fine 
 for primitives up to 64 bits in size, but doesn't clear up the 
 situation for larger types (real, complex types, cent/ucent, 
 ...).
 * We make shared not insert atomic operations at all (thus 
 making it kind of useless for anything but documentation).
 * (Possibly others I have forgotten; please let me know if this 
 is the case.)

 I don't think any of these are particularly attractive, to be 
 honest. If we do make shared insert atomic operations, we would 
 also have to consider the memory ordering of those operations.

 Third, we have memory barriers. I strongly suspect that this is 
 a misnomer in most cases where people have suggested this; it's 
 generally not useful to have a compiler insert barriers because 
 they are used to control ordering of load/store operations 
 which is something the programmer will want to do explicitly. 
 In any case, the compiler can't usefully figure out where to 
 put barriers, so it would just result in really bad performance 
 for no apparent gain.

 Fourth, there is implementation complexity. If shared is meant 
 to insert specialized instructions, it will result in 
 effectively two code paths for most code generation in any D 
 compiler (read: maintenance nightmare).

 Fifth, it is completely unclear whether casting to and from 
 shared is legal (but with a big fat "caution" sign like casting 
 away const) or if it's undefined behavior. Making it undefined 
 behavior would further increase the usability problem I 
 described above.

 And finally, the worst part of all of this? People writing code 
 that uses shared today are blindly assuming it actually does 
 the right thing. It doesn't. Their code will break on any 
 non-x86 platform. This is an absolutely horrifying situation 
 now that ARM, MIPS, and PowerPC are starting to become viable 
 targets for D.

 Something needs to be done about shared. I don't know what, but 
 the current situation is -- and I'm really not exaggerating 
 here -- laughable. I think we either need to just make it 
 perfectly clear that shared is for documentation purposes and 
 nothing else, or, figure out an alternative system to shared, 
 because I don't see shared actually being useful for real world 
 work no matter what we do with it.
Nov 11 2012
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:

 Something needs to be done about shared. I don't know what, but the 
 current situation is -- and I'm really not exaggerating here -- 
 laughable. I think we either need to just make it perfectly clear that 
 shared is for documentation purposes and nothing else, or, figure out 
 an alternative system to shared, because I don't see shared actually 
 being useful for real world work no matter what we do with it.
I feel like the concurrency aspect of D2 was rushed in the haste of
having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
as well as destructors running in any thread (thanks GC!) plus a couple
of other irritants makes the whole concurrency scheme completely flawed
if you ask me. D2 needs a near complete overhaul on the concurrency
front.

I'm currently working on a big code base in C++. While I do miss D when
it comes to working with templates as well as for its compilation speed
and a few other things, I can't say I miss D much when it comes to
anything touching concurrency.

[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 11 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/12/2012 02:48 AM, Michel Fortin wrote:
 On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:

 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out
 an alternative system to shared, because I don't see shared actually
 being useful for real world work no matter what we do with it.
I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
Nov 13 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency front.
 
 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.
 
 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code, so
shared-by-default is not something I have to fight with very often. I
do agree that thread-local is a better default.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 13 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 22:12:12 Michel Fortin wrote:
 I tend to have very little global state in my code, so
 shared-by-default is not something I have to fight with very often. I
 do agree that thread-local is a better default.
Thread-local by default is a _huge_ step forward, and in hindsight, it
seems pretty ridiculous that a language would do anything else. Shared
by default is just too horrible.

- Jonathan M Davis
Nov 13 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.

 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.

 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}

Unfortunately, this destroys 'pure' even though it actually does not.
 so shared-by-default
 is not something I have to fight with very often.  I do agree that
 thread-local is a better default.
Nov 14 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:
 
 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.
 
 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.
 
 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}
I'd consider that poor style. Use a struct to encapsulate the state,
then make bar and baz member functions of that struct. The resulting
code is cleaner and easier to read:

pure int foo()
{
    auto state = State(new_value);
    state.bar();
    return state.baz();
}

You could achieve something similar with nested functions too.
 Unfortunately, this destroys 'pure' even though it actually does not.
Using a local-scoped struct would work with pure, be more efficient
(accessing thread-local variables takes more cycles), and be less
error-prone while refactoring.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/14/2012 01:42 PM, Michel Fortin wrote:
 On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized
 classes[1]
 as well as destructors running in any thread (thanks GC!) plus a
 couple
 of other irritants makes the whole concurrency scheme completely
 flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.

 I'm currently working on a big code base in C++. While I do miss D
 when
 it comes to working with templates as well as for its compilation
 speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.

 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}
I'd consider that poor style.
I'd consider that poor style.
I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification. "In a few cases" it is not, even if it is poor style "most of the time".
 Use a struct to encapsulate the state, then make bar, and baz member functions
of that struct.
They could eg. be virtual member functions of a class already.
 Using a local-scoped struct would work with pure,
It might.
 be more efficient
Not necessarily.
 (accessing thread-local variables takes more cycles),
It can be accessed sparsely, copying around the struct pointer is work too, and the fastest access path in a proper alternative design would potentially be even slower.
 and be less error-prone while refactoring.
If done in such a way that it makes refactoring error prone, it is to be considered poor style.
Nov 14 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-14 14:30:19 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 01:42 PM, Michel Fortin wrote:
 On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:
 
 So do I. A thread-local static variable does not imply global state.
 (The execution stack is static.) Eg. in a few cases it is sensible to
 use static variables as implicit arguments to avoid having to pass
 them around by copying them all over the execution stack.
 
 private int x = 0;
 
 int foo(){
      int xold = x;
      scope(exit) x = xold;
      x = new_value;
      bar(); // reads x
      return baz(); // reads x
 }
I'd consider that poor style.
I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification.
Indeed. There's not enough context to judge fairly. I can accept the
idea there are situations where it is really inconvenient or impossible
to pass the state as an argument.

That said, I disagree that this is not using global state. It might not
be globally accessible (because x is private), but the state still
exists globally since variable x exists in all threads irrespective of
whether they use foo or not.
 If done in such a way that it makes refactoring error prone, it is to 
 be considered poor style.
I guess we agree.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that
type work in a concurrent context, regardless of what is done with
memory barriers. Memory barriers ensure sequential consistency; they do
nothing for race conditions that are sequentially consistent. Remember,
single core CPUs are all sequentially consistent, and still have major
concurrency problems. This also means that having templates accept
shared(T) as arguments and have them magically generate correct
concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to
ever work. Adding barriers has to be done by someone who knows what
they're doing for that particular use case, and the compiler inserting
them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced
self-documentation is immensely useful. It flags where and when data is
being shared. So, your algorithm won't compile when you pass it a shared
type? That is because it is NEVER GOING TO WORK with a shared type. At
least you get a compile time indication of this, rather than random
runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.
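
A minimal sketch of that sequence (Data, dataLock and update are
invented names; the mutex lives in __gshared space):

import core.sync.mutex;

struct Data { int count; }

__gshared Mutex dataLock;
shared Data data;

shared static this() { dataLock = new Mutex; }

void update()
{
    dataLock.lock();                // 1. ensure single threaded access
    scope (exit) dataLock.unlock(); // 5. release the mutex on scope exit
    auto p = cast(Data*) &data;     // 2. cast away shared
    p.count += 1;                   // 3. operate on the data
}                                   // 4. data is only reachable as shared again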
Nov 11 2012
next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
The only problem beeing that you can not really have user defined shared 
(value) types:

http://d.puremagic.com/issues/show_bug.cgi?id=8295

Kind Regards
Benjamin Thaut
Nov 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/11/2012 10:05 PM, Benjamin Thaut wrote:
 The only problem beeing that you can not really have user defined shared
(value)
 types:

 http://d.puremagic.com/issues/show_bug.cgi?id=8295
If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
Nov 11 2012
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 12.11.2012 07:50, schrieb Walter Bright:
 On 11/11/2012 10:05 PM, Benjamin Thaut wrote:
 The only problem beeing that you can not really have user defined
 shared (value)
 types:

 http://d.puremagic.com/issues/show_bug.cgi?id=8295
If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
I'm not talking about objects, I'm talking about value types. And you
can't make it work at all. If you do

shared ~this()
{
    buf = null;
}

it won't work either. You don't have _any_ option to destroy a shared
struct.

Kind Regards
Benjamin Thaut
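
For reference, the pattern from the bug report looks roughly like this
(a sketch; Buffer is a made-up name):

struct Buffer
{
    int[] buf;
    ~this() { buf = null; }            // non-shared destructor
    // shared ~this() { buf = null; }  // rejected as well, per issue 8295
}

void main()
{
    shared Buffer b;
    // on scope exit the compiler must run the destructor on a
    // shared(Buffer), which fails to type check
}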
Nov 12 2012
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Sun, 11 Nov 2012 18:30:17 -0800,
Walter Bright <newshound2 digitalmars.com> wrote:

 
 To make a shared type work in an algorithm, you have to:
 
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
 
 Also, all op= need to be disabled for shared types.
But there are also shared member functions and they're kind of annoying
right now:

* You can't call shared methods from non-shared methods or vice versa.
  This leads to code duplication, you basically have to implement
  everything twice:

----------
struct ABC
{
    Mutex mutex;
    void a()
    {
        aImpl();
    }
    shared void a()
    {
        synchronized(mutex)
            aImpl();  //not allowed
    }
    private void aImpl()
    {
    }
}
----------
The only way to avoid this is casting away shared in the shared a
method, but that really is annoying.

* You can't have data members be included only for the shared version.
  In the above example, the mutex member will always be included, even
  if the ABC instance is thread local.

So you're often better off writing a non-thread safe struct and writing
a wrapper struct. This way you don't have useless overhead in the
non-thread safe implementation. But the nice instance syntax is lost:

shared(ABC) abc1; ABC abc2;
vs
SharedABC abc1; ABC abc2;

even worse, shared propagation won't work this way:

struct DEF
{
    ABC abc;
}
shared(DEF) def;
def.abc.a();

and then there's also the druntime issue: core.sync doesn't work with
shared, which leads to this schizophrenic situation:

struct A
{
    Mutex m;
    void a() //Doesn't compile with shared
    {
        m.lock();  //Compiles, but locks on a TLS mutex!
        m.unlock();
    }
}

struct A
{
    shared Mutex m;
    shared void a()
    {
        m.lock();  //Doesn't compile
        (cast(Mutex)m).unlock(); //Ugly
    }
}

So the only useful solution avoids using shared:

struct A
{
    __gshared Mutex m; //Good we have __gshared!
    shared void a()
    {
        m.lock();
        m.unlock();
    }
}

And then there are some open questions with advanced use cases:

* How do I make sure that a non-shared delegate is only accepted if I
  have an A, but a shared delegate should be supported for shared(A)
  and A? (calling a shared delegate from a non-shared function should
  work, right?)

struct A
{
    void a(T)(T v)
    {
        writeln("non-shared");
    }
    shared void a(T)(T v) if (isShared!v) //isShared doesn't exist
    {
        writeln("shared");
    }
}

And having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2

* What's the difference between "void delegate() shared" and
  "shared(void delegate())"?

Error: cannot implicitly convert expression (&a.abc) of type void
delegate() shared to shared(void delegate())

* So let's call it void delegate() shared instead:

void incrementA(void delegate() shared del)
/home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
are only valid for non-static member functions
Nov 12 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
    This leads to code duplication, you basically have to implement
    everything twice:
You can't get away from the fact that data that can be accessed from
multiple threads has to be dealt with in a *fundamentally* different way
than single threaded code. You cannot share code between the two. There
is simply no conceivable way that "shared" can be added and then code
will become thread safe.

Most of the issues you're having seem to revolve around treating shared
data access just like single threaded access, except "shared" was added.
This cannot work. The compiler error messages, while very annoying, are
in their own obscure way pointing this out.

It's my fault, I have not explained shared very well, and have oversold
it. It does not solve concurrency problems, it points them out.
 ----------
 struct ABC
 {
          Mutex mutex;
 	void a()
 	{
 		aImpl();
 	}
 	shared void a()
 	{
 		synchronized(mutex)
 		    aImpl();  //not allowed
 	}
 	private void aImpl()
 	{
 		
 	}
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.
As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.
 * You can't have data members be included only for the shared version.
    In the above example, the mutex member will always be included, even
    if ABC instance is thread local.

 So you're often better off writing a non-thread safe struct and writing
 a wrapper struct. This way you don't have useless overhead in the
 non-thread safe implementation. But the nice instance syntax is
 lost:

 shared(ABC) abc1; ABC abc2;
 vs
 SharedABC abc1; ABC abc2;

 even worse, shared propagation won't work this way;

 struct DEF
 {
      ABC abc;
 }
 shared(DEF) def;
 def.abc.a();



 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:
 struct A
 {
      Mutex m;
      void a() //Doesn't compile with shared
      {
          m.lock();  //Compiles, but locks on a TLS mutex!
          m.unlock();
      }
 }

 struct A
 {
      shared Mutex m;
      shared void a()
      {
          m.lock();  //Doesn't compile
          (cast(Mutex)m).unlock(); //Ugly
      }
 }

 So the only useful solution avoids using shared:
 struct A
 {
      __gshared Mutex m; //Good we have __gshared!
      shared void a()
      {
          m.lock();
          m.unlock();
      }
 }
Yes, mutexes will need to exist in a global space.
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only accepted if I
    have an A, but a shared delegate should be supported
    for shared(A) and A? (calling a shared delegate from a non-shared
    function should work, right?)

 struct A
 {
      void a(T)(T v)
      {
          writeln("non-shared");
      }
      shared void a(T)(T v)  if (isShared!v) //isShared doesn't exist
      {
          writeln("shared");
      }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
    and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type void
 delegate() shared
The delegate deals with shared data.
 to shared(void delegate())
The variable holding the delegate is shared.
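
In code form (a hedged sketch of the two types):

struct S
{
    shared void f() { }
}

void main()
{
    shared S s;
    auto dg = &s.f;             // void delegate() shared:
                                // the delegate's context is shared data
    shared(void delegate()) sv; // the *variable* holding an ordinary
                                // delegate is itself shared
}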
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
    are only valid for non-static member functions
Nov 12 2012
next sibling parent reply luka8088 <luka8088 owave.net> writes:
If I understood correctly there is no reason why this should not compile ?

import core.sync.mutex;

class MyClass {
   void method () {}
}

void main () {
   auto myObject = new shared(MyClass);
   synchronized (myObject) {
     myObject.method();
   }
}


On 12.11.2012 12:19, Walter Bright wrote:
 On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
 This leads to code duplication, you basically have to implement
 everything twice:
You can't get away from the fact that data that can be accessed from
multiple threads has to be dealt with in a *fundamentally* different way
than single threaded code. You cannot share code between the two. There
is simply no conceivable way that "shared" can be added and then code
will become thread safe.

Most of the issues you're having seem to revolve around treating shared
data access just like single threaded access, except "shared" was added.
This cannot work. The compiler error messages, while very annoying, are
in their own obscure way pointing this out.

It's my fault, I have not explained shared very well, and have oversold
it. It does not solve concurrency problems, it points them out.
 ----------
 struct ABC
 {
 Mutex mutex;
 void a()
 {
 aImpl();
 }
 shared void a()
 {
 synchronized(mutex)
 aImpl(); //not allowed
 }
 private void aImpl()
 {

 }
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.
As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.
 * You can't have data members be included only for the shared version.
 In the above example, the mutex member will always be included, even
 if ABC instance is thread local.

 So you're often better off writing a non-thread safe struct and writing
 a wrapper struct. This way you don't have useless overhead in the
 non-thread safe implementation. But the nice instance syntax is
 lost:

 shared(ABC) abc1; ABC abc2;
 vs
 SharedABC abc1; ABC abc2;

 even worse, shared propagation won't work this way;

 struct DEF
 {
 ABC abc;
 }
 shared(DEF) def;
 def.abc.a();



 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:
 struct A
 {
 Mutex m;
 void a() //Doesn't compile with shared
 {
 m.lock(); //Compiles, but locks on a TLS mutex!
 m.unlock();
 }
 }

 struct A
 {
 shared Mutex m;
 shared void a()
 {
 m.lock(); //Doesn't compile
 (cast(Mutex)m).unlock(); //Ugly
 }
 }

 So the only useful solution avoids using shared:
 struct A
 {
 __gshared Mutex m; //Good we have __gshared!
 shared void a()
 {
 m.lock();
 m.unlock();
 }
 }
Yes, mutexes will need to exist in a global space.
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only accepted if I
 have an A, but a shared delegate should be supported
 for shared(A) and A? (calling a shared delegate from a non-shared
 function should work, right?)

 struct A
 {
 void a(T)(T v)
 {
 writeln("non-shared");
 }
 shared void a(T)(T v) if (isShared!v) //isShared doesn't exist
 {
 writeln("shared");
 }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
 and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type void
 delegate() shared
The delegate deals with shared data.
 to shared(void delegate())
The variable holding the delegate is shared.
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
 are only valid for non-static member functions
Nov 12 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 16:00, luka8088 wrote:
 If I understood correctly there is no reason why this should not compile ?

 import core.sync.mutex;

 class MyClass {
 void method () {}
 }

 void main () {
 auto myObject = new shared(MyClass);
 synchronized (myObject) {
 myObject.method();
 }
 }
D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
parent luka8088 <luka8088 owave.net> writes:
Here i as wild idea:

//////////

void main () {

   mutex x;
   // mutex is not a type but rather a keyword
   // x is a symbol in order to allow
   // different x in different scopes

   shared(x) int i;
   // ... or maybe use UDA ?
   // mutex x must be locked
   // in order to change i

   synchronized (x) {
     // lock x in a compiler-aware way
     i++;
     // compiler guarantees that i will not
     // be changed outside synchronized(x)
   }

}

//////////

so I tried something similar with current implementation:

//////////

import std.stdio;

void main () {

   shared(int) i1;
   auto m1 = new MyMutex();

   i1.attachMutex(m1);
   // m1 must be locked in order to modify i1
	
   // i1++;
   // should throw a compiler error

   // sharedAccess(i1)++;
   // runtime exception, m1 is not locked

   synchronized (m1) {
     sharedAccess(i1)++;
     // ok, m1 is locked
   }

}

// some generic code

import core.sync.mutex;

class MyMutex : Mutex {
   @property bool locked = false;
   @trusted void lock () {
     super.lock();
     locked = true;
   }
   @trusted void unlock () {
     locked = false;
     super.unlock();
   }
   bool tryLock () {
     bool result = super.tryLock();
     if (result)
       locked = true;
     return result;
   }
}

template unshared (T : shared(T)) {
   alias T unshared;
}

template unshared (T : shared(T)*) {
   alias T* unshared;
}

auto ref sharedAccess (T) (ref T value) {
   assert(value.attachMutex().locked);
   unshared!(T)* refVal = (cast(unshared!(T*)) &value);
   return *refVal;
}

MyMutex attachMutex (T) (T value, MyMutex mutex = null) {
   static __gshared MyMutex[T] mutexes;
   // this memory leak can be solved
   // but it's left like this to make the code simple
   synchronized if (value !in mutexes && mutex !is null)
     mutexes[value] = mutex;
   assert(mutexes[value] !is null);
   return mutexes[value];
}

//////////

and another example with methods:

//////////

import std.stdio;

class a {
   int i;
   void increment () { i++; }
}

void main () {

   auto a1 = new shared(a);
   auto m1 = new MyMutex();

   a1.attachMutex(m1);
   // m1 must be locked in order to modify a1
	
   // a1.increment();
   // compiler error

   // sharedAccess(a1).increment();
   // runtime exception, m1 is not locked

   synchronized (m1) {
     sharedAccess(a1).increment();
     // ok, m1 is locked
   }

}

// some generic code

import core.sync.mutex;

class MyMutex : Mutex {
   @property bool locked = false;
   @trusted void lock () {
     super.lock();
     locked = true;
   }
   @trusted void unlock () {
     locked = false;
     super.unlock();
   }
   bool tryLock () {
     bool result = super.tryLock();
     if (result)
       locked = true;
     return result;
   }
}

template unshared (T : shared(T)) {
   alias T unshared;
}

template unshared (T : shared(T)*) {
   alias T* unshared;
}

auto ref sharedAccess (T) (ref T value) {
   assert(value.attachMutex().locked);
   unshared!(T)* refVal = (cast(unshared!(T*)) &value);
   return *refVal;
}

MyMutex attachMutex (T) (T value, MyMutex mutex = null) {
   static __gshared MyMutex[T] mutexes;
   // this memory leak can be solved
   // but it's left like this to make the code simple
   synchronized if (value !in mutexes && mutex !is null)
     mutexes[value] = mutex;
   assert(mutexes[value] !is null);
   return mutexes[value];
}

//////////

In any case, if shared itself does not provide locking and does not fix 
problems but only points them out (not to be misunderstood, I 
completely agree with that) then I think that assigning a mutex to the 
variable is a must.

Although the latter examples already work with the current 
implementation, I like the first one (or something similar to the first 
one) more; it looks cleaner and leaves space for additional 
optimizations.


On 12.11.2012 17:14, deadalnix wrote:
 Le 12/11/2012 16:00, luka8088 a écrit :
 If I understood correctly there is no reason why this should not
 compile ?

 import core.sync.mutex;

 class MyClass {
 void method () {}
 }

 void main () {
 auto myObject = new shared(MyClass);
 synchronized (myObject) {
 myObject.method();
 }
 }
D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
prev sibling parent "Johannes Pfau" <nospam example.com> writes:
On Monday, 12 November 2012 at 11:19:57 UTC, Walter Bright wrote:
 On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of 
 annoying
 right now:

 * You can't call shared methods from non-shared methods or 
 vice versa.
   This leads to code duplication, you basically have to 
 implement
   everything twice:
You can't get away from the fact that data that can be accessed from multiple threads has to be dealt with in a *fundamentally* different way than single threaded code. You cannot share code between the two. There is simply no conceivable way that "shared" can be added and then code will become thread safe.
I know shared can't automatically make the code thread safe. I just wanted to point out that this casting / code duplication is annoying, but I don't know how it could be solved either.
 Yes, mutexes will need to exist in a global space.
I'm not sure if I understand this. Don't you think shared(Mutex) should work? AFAICS that's only a library problem: add shared to the lock / unlock methods in druntime and it should work? Or global as in not in the struct instance?
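
Until then, a hedged sketch of the workaround: tiny free functions that
do the cast in one audited place (lockShared/unlockShared are made-up
names, not druntime API):

import core.sync.mutex;

void lockShared(shared Mutex m)   { (cast(Mutex) m).lock(); }
void unlockShared(shared Mutex m) { (cast(Mutex) m).unlock(); }

// usage inside a shared method, given a field `shared Mutex m`:
// shared void a()
// {
//     m.lockShared();
//     scope (exit) m.unlockShared();
//     // ... manipulate the data ...
// }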
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only 
 accepted if I
   have an A, but a shared delegate should be supported
   for shared(A) and A? (calling a shared delegate from a 
 non-shared
   function should work, right?)

 struct A
 {
     void a(T)(T v)
     {
         writeln("non-shared");
     }
     shared void a(T)(T v)  if (isShared!v) //isShared doesn't 
 exist
     {
         writeln("shared");
     }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
I'm talking about a delegate pointing to a method declared with the
"shared" keyword and the "this pointer" pointing to a shared object:

struct A
{
    shared void a(){}
}
shared A instance;
auto del = &instance.a; //I'm talking about this type

To explain that usecase: I think of a shared delegate as a delegate
that can be safely called from different threads. So I can store it in
a struct instance and later on call it from any thread:

struct Signal
{
    //The variable is shared _AND_ the method is shared
    shared(shared void delegate()) _handler;

    shared void call() //Can be called from any thread
    {
        //Would have to synchronize access to the variable in a real
        //world case, but the call itself wouldn't have to be
        //synchronized
        shared void delegate() localHandler;
        synchronized(mutex)
        {
            localHandler = _handler;
        }
        localHandler();
    }
}
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
   and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type 
 void
 delegate() shared
The delegate deals with shared data.
OK, so that's what I need, but the compiler doesn't let me declare that
type:

alias void delegate() shared del;
Error: const/immutable/shared/inout attributes are only valid for
non-static member functions
 to shared(void delegate())
The variable holding the delegate is shared.
OK, but when it's used as a function parameter, which is pass-by-value
for delegates, and because of tail-shared there's effectively no
difference, right? In that case it's not possible to pass a shared
variable to the function, as this will always create a copy?

void abcd(shared(void delegate()) del)

which is the same as

void abcd(shared void delegate() del)

How would you pass del as a shared variable?
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout 
 attributes
   are only valid for non-static member functions
Nov 12 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 12, 2012, at 2:57 AM, Johannes Pfau <nospam example.com> wrote:

 On Sun, 11 Nov 2012 18:30:17 -0800,
 Walter Bright <newshound2 digitalmars.com> wrote:

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.

 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
   This leads to code duplication, you basically have to implement
   everything twice:

 ----------
 struct ABC
 {
     Mutex mutex;
     void a()
     {
         aImpl();
     }
     shared void a()
     {
         synchronized(mutex)
             aImpl();  //not allowed
     }
     private void aImpl()
     {
     }
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.

Yes. You end up having two methods for each function, one as a
synchronized wrapper that casts away shared and another that does the
actual work.

 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:

 struct A
 {
     Mutex m;
     void a() //Doesn't compile with shared
     {
         m.lock();  //Compiles, but locks on a TLS mutex!
         m.unlock();
     }
 }

Most of the reason for this was that I didn't like the old implications
of shared, which was that shared methods would at some time in the
future end up with memory barriers all over the place. That's been
dropped, but I'm still not a fan of the wrapper method for each
function. It makes for a crappy class design.
Nov 14 2012
prev sibling next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright  
<newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a
"scoped" struct wrapping the mutex and shared object which does all the
"dirty" work for you. I'm thinking..

                             // (0)
with(ScopedLock(obj,lock))   // (1)
{
  obj.foo = 2;               // (2)
}                            // (3)
                             // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support but
it would be "nice" if the compiler did something clever with 'obj' such
that it knew it wasn't 'shared' inside the 'with' above. If not, if a
full library solution is desired, we could always have another
temporary "unshared" variable referencing obj.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
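
A rough library-only cut of such a ScopedLock (a sketch; the cast is
the unchecked part that compiler support would make safe):

import core.sync.mutex;

struct ScopedLock(T)
{
    private T* obj;      // unshared view of the wrapped object
    private Mutex lock;

    @disable this(this); // guard against double unlock

    this(ref shared T sharedObj, Mutex m)
    {
        lock = m;
        lock.lock();               // (1) acquire the mutex...
        obj = cast(T*) &sharedObj; //     ...and cast shared away
    }

    ~this() { lock.unlock(); }     // (3) release on scope exit

    ref T get() { return *obj; }   // (2) unshared access inside the scope
}

// usage, assuming some shared(Foo) obj and a global Mutex lock:
// auto sl = ScopedLock!Foo(obj, lock);
// sl.get.foo = 2;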
Nov 12 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 11:55:51 -0000, Regan Heath <regan netmail.co.nz>  
wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright  
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj.
There was talk a while back about how to handle the existing object
mutex and synchronized{} statement blocks, and this subject has me
thinking back to that. My thinking has gone full circle and rather than
bore you with all the details I want to present a conclusion which I am
hoping is both implementable and useful.

First off, IIRC object contains a mutex/monitor/critical section, which
means all objects contain one. The last discussion saw many people
wanting this removed for efficiency. I propose we do this. Then, if a
class or struct is declared as "shared", or a "shared" instance of a
class or struct is constructed, we magically include one (compiler magic
which I hope is possible).

Secondly, I say we make "shared" illegal on basic types. This is a
limitation(*) but I believe in most cases a single int is unlikely to be
shared without an accompanying group of other variables, and usually an
algorithm operating on those variables. These variables and the
algorithm should be encapsulated in a class or struct - which can in
turn be shared.

Now.. the synchronized() {} statement can do the magic described above
(as ScopedLock) for us. It would be illegal to call it on a non "shared"
instance. It would acquire the mutex and cast away "shared" inside the
block/scope; at the end of the scope it would cast shared back and
release the mutex.

(*) for those rare cases where a single int or other basic type is all
that is shared, we can provide a wrapper struct which is declared as
"shared" - see the sketch below.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
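
A hedged sketch of that wrapper for a lone basic type, built on
core.atomic (SharedInt is a made-up name):

import core.atomic;

struct SharedInt
{
    private shared int value;

    int  get()      { return atomicLoad(value); }
    void set(int v) { atomicStore(value, v); }
    void opOpAssign(string op)(int v) { atomicOp!(op ~ "=")(value, v); }
}

// usage:
// __gshared SharedInt hits;
// hits += 1;            // atomic, via opOpAssign
// auto n = hits.get();  // atomic read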
Nov 12 2012
parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 13:25, Regan Heath wrote:
 First off, IIRC object contains a mutex/monitor/critical section, which
 means all objects contain one. The last discussion saw many people
 wanting this removed for efficiency. I propose we do this. Then, if a
 class or struct is declared as "shared" or a "shared" instance of a
 class or struct is constructed we magically include one (compiler magic
 which I hope is possible).
As already explained in the thread you mention, it is not going to work. The conclusion of that thread was that only synchronized classes should have a mutex field.
 Secondly I say we make "shared" illegal on basic types. This is a
 limitation(*) but I believe in most cases a single int is unlikely to be
 shared without an accompanying group of other variables, and usually an
 algorithm operating on those variables. These variables and the
 algorithm should be encapsulated in a class or struct - which can in
 turn be shared.
Shared reference counting? Disruptor?
 Now.. the synchronized() {} statement can do the magic described above
 (as ScopedLock) for us. It would be illegal to call it on a non "shared"
 instance. It would acquire the mutex and cast away "shared" inside the
 block/scope, at the end of the scope it would cast shared back and
 release the mutex.

 (*) for those rare cases where a single int or other basic type is all
 that is shared we can provide a wrapper struct which is declared as
 "shared".
Nov 12 2012
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-12 12:55, Regan Heath wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking..

                            // (0)
with(ScopedLock(obj,lock))  // (1)
{
    obj.foo = 2;            // (2)
}                           // (3)
                            // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support, but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired, we could always have another temporary "unshared" variable referencing obj.
I'm just throwing it in here again, AST macros could probably solve this. -- /Jacob Carlborg
Nov 12 2012
parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2012-11-12, 15:11, Jacob Carlborg wrote:

 On 2012-11-12 12:55, Regan Heath wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking..

                            // (0)
with(ScopedLock(obj,lock))  // (1)
{
    obj.foo = 2;            // (2)
}                           // (3)
                            // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support, but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired, we could always have another temporary "unshared" variable referencing obj.
I'm just throwing it in here again, AST macros could probably solve this.
Until someone writes a proper DIP on them, macros can write entire software packages, download Hitler, turn D into lisp, and bake bread. Can we please stop with the 'macros could do that' until there's any sort of consensus as to what macros *could* do? -- Simen
Nov 12 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-12 17:57, Simen Kjaeraas wrote:

 Until someone writes a proper DIP on them, macros can write entire software
 packages, download Hitler, turn D into lisp, and bake bread. Can we please
 stop with the 'macros could do that' until there's any sort of consensus as
 to what macros *could* do?
Sure, I can try and stop doing that :) -- /Jacob Carlborg
Nov 12 2012
parent FeepingCreature <default_357-line yahoo.de> writes:
On 11/12/12 20:08, Jacob Carlborg wrote:
 On 2012-11-12 17:57, Simen Kjaeraas wrote:
 
 Until someone writes a proper DIP on them, macros can write entire software
 packages, download Hitler, turn D into lisp, and bake bread. Can we please
 stop with the 'macros could do that' until there's any sort of consensus as
 to what macros *could* do?
Sure, I can try and stop doing that :)
You know, AST macros could probably stop doing that. Food for thought.
Nov 13 2012
prev sibling next sibling parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 03:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.
The compiler is able to do some optimization on that, and, unlike me, it never forgets to put a barrier where one is needed. Some algorithms are safe to use concurrently, granted the right barriers are in place -- think double-checked locking, for instance. This is the very reason volatile was modified in Java 1.5 to include barriers. I wish D's shared would get semantics close to Java's volatile.
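A sketch of the double-checked locking pattern mentioned above, written against core.atomic's explicit primitives; the MemoryOrder names follow current druntime, and Expensive, cached, and initLock are made-up names:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.sync.mutex : Mutex;

struct Expensive { int payload; }

// The pointer slot is global; the pointee is typed shared because it is
// visible to every thread once published.
__gshared shared(Expensive)* cached;
__gshared Mutex initLock;

shared static this() { initLock = new Mutex; }

shared(Expensive)* instance()
{
    // Fast path: the acquire load pairs with the release store below,
    // so a non-null pointer is published together with the object.
    auto p = atomicLoad!(MemoryOrder.acq)(cached);
    if (p is null)
    {
        initLock.lock();
        scope (exit) initLock.unlock();
        p = atomicLoad!(MemoryOrder.acq)(cached);  // re-check under lock
        if (p is null)
        {
            p = new shared(Expensive)(42);
            atomicStore!(MemoryOrder.rel)(cached, p);
        }
    }
    return p;
}

void main() { assert(instance() is instance()); }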
 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.
Agreed.
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.
That is never gonna scale without some kind of ownership of data. Think about slices.
Nov 12 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 12 November 2012 04:30, Walter Bright <newshound2 digitalmars.com> wrote:

 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:

 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the shared
 type
 qualifier -- does not have well-defined semantics at all.

 I think a couple things are clear:

 1. Slapping shared on a type is never going to make algorithms on that
 type work in a concurrent context, regardless of what is done with memory
 barriers. Memory barriers ensure sequential consistency, they do nothing
 for race conditions that are sequentially consistent. Remember, single
 core CPUs are all sequentially consistent, and still have major
 concurrency problems. This also means that having templates accept
 shared(T) as arguments and have them magically generate correct
 concurrent code is a pipe dream.

 2. The idea of shared adding memory barriers for access is not going to
 ever work. Adding barriers has to be done by someone who knows what
 they're doing for that particular use case, and the compiler inserting
 them is not going to substitute.

 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.
I agree completely with the OP; shared is really very unhelpful right now. It just inconveniences you, and forces you to perform explicit casts (which may cast away other attributes like const).

I've thought before that what it might be useful+practical for shared to do is offer convenient methods to implement precisely what you describe above.

Imagine a system where tagging a variable 'shared' would cause it to gain some properties: a mutex, and implicit var.lock()/release() methods to call on either side of access to your shared variable. Unlike the current situation where assignment is illegal, assignment would work as usual, but the shared tag would imply a runtime check to verify the item is locked when performing assignment (perhaps that runtime check would be removed in -release for performance).

This would make implementing the logic you describe above convenient, and you wouldn't need to be declaring explicit mutexes around the place. It would also address safety by asserting that the item is locked whenever it is accessed.
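Roughly what that runtime-checked assignment could look like as a library type; every name here is hypothetical, and note the locked flag itself is not synchronized -- fine for a debugging aid, not a correctness guarantee:

import core.sync.mutex : Mutex;

// A wrapped value gains a mutex plus lock()/release(), and assignment
// asserts that the lock is held.
struct Guarded(T)
{
    private T value;
    private Mutex m;
    private bool locked;        // bookkeeping for the runtime check only

    static Guarded create()
    {
        Guarded g;
        g.m = new Mutex;
        return g;
    }

    void lock()    { m.lock(); locked = true; }
    void release() { locked = false; m.unlock(); }

    void opAssign(T v)
    {
        assert(locked, "assignment to shared data without holding its lock");
        value = v;
    }
}

void main()
{
    auto x = Guarded!int.create();
    x.lock();
    x = 42;        // ok: the runtime check passes
    x.release();
    // x = 43;     // would trip the assert (compiled out with -release)
}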
Nov 12 2012
prev sibling next sibling parent reply luka8088 <luka8088 owave.net> writes:
On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}
Nov 13 2012
next sibling parent reply "luka8088" <luka8088 owave.net> writes:
On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to 
 newcomers about D's
 concurrency support because the most fundamental part of it 
 -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}
Nov 13 2012
next sibling parent reply =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
On 13.11.2012 10:14, luka8088 wrote:
 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}
Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is just a low-level module that represents the OS functionality.
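To make the distinction concrete, a small std.concurrency example; the commented-out line is the kind of thing send() rejects at compile time, because int* carries unshared aliasing:

import std.concurrency : receiveOnly, send, spawn;

void worker()
{
    // Values arrive by copy; no unshared mutable state crosses threads.
    auto n = receiveOnly!int();
    assert(n == 42);
}

void main()
{
    auto tid = spawn(&worker);
    tid.send(42);          // fine: int has no unshared aliasing

    int local;
    // tid.send(&local);   // rejected at compile time: int* aliases
    //                     // thread-local data
}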
Nov 13 2012
parent reply luka8088 <luka8088 owave.net> writes:
On 13.11.2012 10:20, Sönke Ludwig wrote:
 Am 13.11.2012 10:14, schrieb luka8088:
 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}

Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is just a low-level module that represents the OS functionality.
In that case http://dlang.org/faq.html#shared_guarantees is wrong; it is not a correct guarantee. Or at least that should be noted there. If nothing else, it is confusing...
Nov 13 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 13 November 2012 at 10:06:12 UTC, luka8088 wrote:
 On 13.11.2012 10:20, Sönke Ludwig wrote:
 Only std.concurrency (using spawn() and send()) enforces that 
 unshared data cannot be pass between
 threads. The core.thread module is just a low-level module 
 that just represents the OS functionality.
In that case http://dlang.org/faq.html#shared_guarantees is wrong, it is not a correct guarantee. Or at least that should be noted there. If nothing else it is confusing...
You are right, it could probably be added to avoid confusion. But then, non-@safe code is not guaranteed to maintain any type system invariants at all if you don't pay attention to what its requirements are, so memory sharing is not really special in that regard… David
Nov 13 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 13, 2012, at 1:14 AM, luka8088 <luka8088 owave.net> wrote:

 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.

 I think a couple things are clear:

 1. Slapping shared on a type is never going to make algorithms on that
 type work in a concurrent context, regardless of what is done with
 memory barriers. Memory barriers ensure sequential consistency, they do
 nothing for race conditions that are sequentially consistent. Remember,
 single core CPUs are all sequentially consistent, and still have major
 concurrency problems. This also means that having templates accept
 shared(T) as arguments and have them magically generate correct
 concurrent code is a pipe dream.

 2. The idea of shared adding memory barriers for access is not going to
 ever work. Adding barriers has to be done by someone who knows what
 they're doing for that particular use case, and the compiler inserting
 them is not going to substitute.

 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.

 This clarifies a lot, but still a lot of people get confused with
 http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

 And also, given what http://dlang.org/faq.html#shared_guarantees says, I
 come to think that the fact that the following code compiles is either a
 lack of implementation, a compiler bug, or a faq error:

 //////////

 import core.thread;

 void main () {
   int i;
   (new Thread({ i++; })).start();
 }

It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines:

import core.thread;
import std.stdio;

void main()
{
    int i;
    auto t = new Thread({ i++; });
    t.start();
    t.join();
    write(i);
}

This is perfectly safe and has a deterministic result.
Nov 14 2012
parent luka8088 <luka8088 owave.net> writes:
On 14.11.2012 20:54, Sean Kelly wrote:
 On Nov 13, 2012, at 1:14 AM, luka8088<luka8088 owave.net>  wrote:

 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}

It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines:

void main()
{
    int i;
    auto t = new Thread({ i++; });
    t.start();
    t.join();
    write(i);
}

This is perfectly safe and has a deterministic result.

Yes, that makes perfect sense... I just wanted to point out the misleading guidance in the FAQ, because (at least before this forum thread) there is not much written about shared and you can get a wrong idea from it (at least I did).
Nov 14 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
 and also with http://dlang.org/faq.html#shared_guarantees said, I come to think
 that the fact that the following code compiles is either lack of
implementation,
 a compiler bug or a faq error ?

 //////////

 import core.thread;

 void main () {
    shared int i;
    (new Thread({ i++; })).start();
 }
I think it's a user bug.
Nov 13 2012
next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 13 November 2012 at 21:29:13 UTC, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused 
 with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
FWIW, I'm with you on this one. Memory barriers would not make shared more useful, as they do not solve the issue with concurrency (as you have explained earlier).
Nov 13 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
Nov 13 2012
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu 
wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused 
 with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
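One shape such a library utility could take -- a hypothetical helper that packages the five steps into a single call; withLock and Account are made-up names:

import core.sync.mutex : Mutex;

// Hypothetical helper wrapping the recipe: lock, cast away shared,
// operate, let the unshared view expire, unlock.
auto withLock(alias op, T)(shared(T)* data, Mutex m)
{
    m.lock();                   // 1. ensure single-threaded access
    scope (exit) m.unlock();    // 5. release the mutex on every path
    return op(*cast(T*) data);  // 2.-4. op sees an unshared view only
                                //       while the lock is held
}

struct Account { int balance; }

shared Account acct;
__gshared Mutex acctLock;

void main()
{
    acctLock = new Mutex;
    withLock!((ref Account a) => a.balance += 100)(&acct, acctLock);
    assert(withLock!((ref Account a) => a.balance)(&acct, acctLock) == 100);
}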
Nov 13 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 2:07 PM, Peter Alexander wrote:
 On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
Oh ok, thanks. That does make sense. There's been quite a bit of discussion between Bartosz, Walter, and myself about allowing transparent loads and stores as opposed to defining intrinsics x.load and x.store(y). In C++11 both transparent and explicit are allowed, and an emergent idiom is "always use the explicit versions because they clarify flow and cost". Andrei
Nov 13 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 13/11/2012 23:07, Peter Alexander wrote:
 On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
It cannot unless some ownership is introduced in D.
Nov 13 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code

I also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions. It just turns a multicore machine into (logically) a single core one, *not* a single threaded one.

But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.
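The "sequentially consistent is not race-free" point in a few lines: every access below is an atomic, sequentially consistent load or store (core.atomic names from current druntime), yet the two threads still race on the read-modify-write as a whole and increments get lost:

import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import std.stdio : writeln;

shared int counter;

void bump()
{
    foreach (i; 0 .. 1_000_000)
    {
        // The load and the store are each atomic and sequentially
        // consistent -- but another thread can slip in between them,
        // so increments are lost.
        int v = atomicLoad(counter);
        atomicStore(counter, v + 1);
    }
}

void main()
{
    auto a = new Thread(&bump);
    auto b = new Thread(&bump);
    a.start(); b.start();
    a.join();  b.join();
    writeln(atomicLoad(counter));  // almost certainly < 2_000_000
}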
Nov 13 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 2:22 PM, Walter Bright wrote:
 On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code
I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this:

shared int x;
...
x = 4;

You'll need to use x.load(4) instead.

Just for the record I'm okay with this breakage.
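For reference, core.atomic already spells this explicitly today; presumably x.store/x.load would be sugar for something like these free functions, which default to sequentially consistent ordering:

import core.atomic : atomicLoad, atomicStore;

shared int x;

void main()
{
    // x = 4;             // the implicit form the proposal would reject
    atomicStore(x, 4);    // explicit, atomic, sequentially consistent
    int v = atomicLoad(x);
    assert(v == 4);
}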
 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions.
It does eliminate all low-level races.
 It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
This is very approximate.
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
Nov 13 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu 
wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.load(4) instead.
You mean x.store(4)? Or am I completely misunderstanding your message? David
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:07 PM, David Nadlinger wrote:
 On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.load(4) instead.
You mean x.store(4)? Or am I completely misunderstanding your message? David
Apologies, yes, store. Andrei
Nov 13 2012
prev sibling next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 On 11/13/12 2:22 PM, Walter Bright wrote:
 On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code

I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this:

shared int x;
...
x = 4;

You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store? (I know you meant store.)
 Just for the record I'm okay with this breakage.

 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions.
It does eliminate all low-level races.
 It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
This is very approximate.
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

Scratch that, make it this:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* references
* function pointers

Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.

-- Alex Rønne Petersen alex lycus.org http://lycus.org
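The sizes driving the amendment can be checked directly: a slice is (pointer, length) and a delegate is (context, function pointer), i.e. two machine words each -- 16 bytes on a 64-bit target, past the 8-byte atomic load/store that can be guaranteed broadly:

// Two machine words each; 16 bytes when size_t is 8 bytes.
static assert((int[]).sizeof == 2 * size_t.sizeof);
static assert((void delegate()).sizeof == 2 * size_t.sizeof);

void main() {}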
Nov 13 2012
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 00:48, Alex Rønne Petersen wrote:
 On 14-11-2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

Scratch that, make it this:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* references
* function pointers

Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.

That list sounds more reasonable.
Nov 13 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit. (I deliberately talk in terms of bytes here because that's the nomenclature most architecture manuals use from what I've seen.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 03:02, Andrei Alexandrescu wrote:
 On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 11/14/2012 3:05 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 03:02, Andrei Alexandrescu wrote:
 On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very
 few
 architectures provide instructions to atomically load/store 16
 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and
 real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising.
On x86 you can use LOCK CMPXCHG16b to do the atomic read: http://stackoverflow.com/questions/9726566/atomic-16-byte-read-on-x64-cpus This just excludes a small number of early AMD processors.
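The read-via-CAS idiom from that answer, sketched on an 8-byte value with druntime's core.atomic.cas; the 16-byte case works the same way once the compiler emits LOCK CMPXCHG16B for a two-word payload:

import core.atomic : cas;

// Atomic read built from compare-and-swap alone: try to replace the
// current value with itself; when the CAS succeeds, the guess was the
// value at the instant of the exchange.
ulong atomicRead(shared(ulong)* v)
{
    ulong guess = 0;
    while (!cas(v, guess, guess))
    {
        // CAS failed: take a fresh (possibly torn) guess and retry.
        // The loop exits only once the CAS confirms it atomically.
        guess = *cast(ulong*) v;
    }
    return guess;
}

void main()
{
    shared ulong x = 0x1234_5678_9ABC_DEF0;
    assert(atomicRead(&x) == 0x1234_5678_9ABC_DEF0);
}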
Nov 14 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

I wouldn't expect it to work for delegates, long, ulong, double, and slices on every arch. If it does work, that is awesome, and it adds to my determination that this is the thing to do.
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 01:09, deadalnix wrote:
 Le 14/11/2012 00:43, Alex Rønne Petersen a écrit :
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

I wouldn't expect it to work for delegates, long, ulong, double, and slices on every arch. If it does work, that is awesome, and it adds to my determination that this is the thing to do.

8-byte atomic loads/stores are doable on all major architectures. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:33 PM, Alex Rønne Petersen wrote:
 8-byte atomic loads/stores are doable on all major architectures.
We're looking at 128-bit load, store, and CAS for 64-bit machines. Andrei
Nov 13 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again... -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :
 On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...
http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
Nov 13 2012
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 03:00, deadalnix wrote:
 Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :
 On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...
http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
Thanks, exactly that. No MIPS, though. I guess I'm going to have to go dig through their manuals. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I need to
 think about that scenario.
Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:29 PM, Walter Bright wrote:
 On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I
 need to
 think about that scenario.
Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Please don't. This is "we're doing better than C++" in disguise and exactly the wrong frame of mind. I find few things more negatively disruptive than lulling into a false sense of achievement. Andrei
Nov 13 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 14:33:50 Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I
 need to think about that scenario.
At this point, I don't see how it could be otherwise. Having the shared equivalent of const would just lead to that being used everywhere and defeat the purpose of shared in the first place. If it's not segregated, it's not doing its job. But that leaves us with most functions not working with shared, which is also a problem. Templates are a partial solution, but they obviously don't work for everything.

In general, I would expect that all uses of shared would be protected by a mutex or synchronized block or other similar construct. It's just going to cause problems to do otherwise. There are some cases where if you can guarantee that writes and reads are atomic, you're fine skipping the mutexes, but those are relatively rare, particularly when you consider the issues in making anything but extremely trivial writes or reads atomic.

That being the case, it doesn't really seem all that unreasonable to me for it to be normal to have to cast shared to non-shared to pass to functions, as long as all of that code is protected with a mutex or another, similar construct - though if those functions aren't pure, you _could_ run into entertaining problems when a non-shared reference to the data gets squirreled away somewhere in those function calls. But we seem to have contradictory requirements here of trying to segregate shared from normal, thread-local stuff while still looking to be able to use shared with functions intended to be used with non-shared stuff.

- Jonathan M Davis
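A minimal sketch of the lock-then-cast idiom described above, with invented names (Data, g_lock, process); the cast is only sound because every access to g_data goes through g_lock:

import core.sync.mutex : Mutex;

struct Data { int[] values; }

shared Data g_data;
__gshared Mutex g_lock;

shared static this() { g_lock = new Mutex; }

void update()
{
    g_lock.lock();
    scope (exit) g_lock.unlock();

    // Lock held: temporarily treat the data as thread-local so that
    // ordinary (non-shared) functions can operate on it.
    auto local = cast(Data*) &g_data;
    process(local);
}

void process(Data* d) { d.values ~= 42; } // an ordinary non-shared function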
Nov 13 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 13/11/2012 23:22, Walter Bright a écrit :
 I'm just not convinced that having the compiler add memory barriers:

 1. will result in correctly working code, when done by programmers who
 have only an incomplete understanding of memory barriers, which would be
 about 99.9% of us.

 2. will result in efficient code

 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions. It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
That is what Java's volatile does. It has several use cases, including valid double-checked locking (it has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefulness and that it requires language support) and the Disruptor, which I wanted to implement for message passing in D but couldn't because of lack of support at the time.

See: http://www.slideshare.net/trishagee/introduction-to-the-disruptor

So sequentially consistent reads/writes are useful.
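For concreteness, a minimal sketch of that double-checked locking pattern in D, with invented names, using core.atomic's (sequentially consistent by default) atomicLoad/atomicStore in the role Java's volatile plays; it assumes core.atomic handles the pointer type as pointer-to-shared:

import core.atomic : atomicLoad, atomicStore;

struct Config { int verbosity; }

shared Config* g_config; // lazily initialized, visible to all threads

shared(Config)* getConfig()
{
    auto c = atomicLoad(g_config); // first check: lock-free, but atomic
    if (c is null)
    {
        synchronized // bare synchronized statement: implicit global mutex
        {
            c = atomicLoad(g_config); // second check, under the lock
            if (c is null)
            {
                c = new shared(Config)(1);
                atomicStore(g_config, c); // publish; a plain assignment
                                          // here is the classic bug
            }
        }
    }
    return c;
}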
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
language multithread, have everything shared and are still able to have finalizers of some sort. Why couldn't a shared object be destroyed? Why should it be destroyed in a specific thread when it can only refer to shared data because of transitivity?
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 4:04 PM, deadalnix wrote:
 That is what java's volatile do. It have several uses cases, including valid
 double check locking (It has to be noted that this idiom is used incorrectly in
 druntime ATM,
Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.
 So sequentially consistent read/write are usefull.
Sure, I agree with that.

 multithread, have everything shared and still are able to have finalizer of
some
 sort.
I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.
 Why couldn't a shared object be destroyed ? Why should it be destroyed in a
 specific thread as it can only refer shared data because of transitivity ?
How can you destroy an object in one thread when another thread is holding live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Nov 13 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 02:39, Walter Bright a écrit :
 On 11/13/2012 4:04 PM, deadalnix wrote:
 That is what java's volatile do. It have several uses cases, including
 valid
 double check locking (It has to be noted that this idiom is used
 incorrectly in
 druntime ATM,
Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.
http://d.puremagic.com/issues/show_bug.cgi?id=6607
 So sequentially consistent read/write are usefull.
Sure, I agree with that.

 language
 multithread, have everything shared and still are able to have
 finalizer of some
 sort.
I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.
 Why couldn't a shared object be destroyed ? Why should it be destroyed
 in a
 specific thread as it can only refer shared data because of
 transitivity ?
How can you destroy an object in one thread when another thread is holding live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Why would you destroy something that isn't dead yet?
Nov 13 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, 
 including valid double-checked locking (it has to be noted that 
 this idiom is used incorrectly in druntime ATM, which proves 
 both its usefulness and that it requires language support) and 
 the Disruptor, which I wanted to implement for message passing 
 in D but couldn't because of lack of support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
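For illustration, a hedged sketch of that API; the flag/data names are invented, and the MemoryOrder members used (raw, acq, rel) are druntime's:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared int data;
shared bool ready;

void producer()
{
    atomicStore!(MemoryOrder.raw)(data, 42);    // the payload
    atomicStore!(MemoryOrder.rel)(ready, true); // release: payload is
                                                // visible before the flag
}

void consumer()
{
    if (atomicLoad!(MemoryOrder.acq)(ready)) // acquire: pairs with the release
        assert(atomicLoad!(MemoryOrder.raw)(data) == 42);
}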
Nov 14 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 13:23, David Nadlinger a écrit :
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
It is a solution now (it wasn't at the time). The main drawback of that solution is that the compiler can't optimize thread-local reads/writes across shared reads/writes. This is a wasted opportunity.
Nov 14 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 13:19:12 UTC, deadalnix wrote:
 The main drawback of that solution is that the compiler can't
 optimize thread-local reads/writes across shared reads/writes.
 This is a wasted opportunity.
You mean moving non-atomic loads/stores across atomic instructions? This is simply a matter of the compiler providing the right intrinsics for implementing the core.atomic functions. LDC already does it. David
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
Nov 14 2012
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 15:32, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
They already work as they should:

* DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work.
* GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency).
* LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe.

I don't think there's anything that actually needs to be fixed there. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:11 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:32, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
They already work as they should:

* DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work.
* GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency).
* LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe.

I don't think there's anything that actually needs to be fixed there.
The language definition should be made clear so that future optimizations of existing implementations, and future implementations, don't push things over the limit. Andrei
Nov 14 2012
prev sibling next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei 
Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix 
 wrote:
 That is what Java's volatile does. It has several use cases, 
 including valid double-checked locking (it has to be noted 
 that this idiom is used incorrectly in druntime ATM, which 
 proves both its usefulness and that it requires language 
 support) and the Disruptor, which I wanted to implement for 
 message passing in D but couldn't because of lack of support 
 at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
Sorry, I don't quite see where I simplified things. Yes, in the implementation of atomicLoad/atomicStore, one would probably use compiler intrinsics, as done in LDC's druntime, or inline assembly, as done for DMD. But an optimizer will never move instructions across opaque function calls, because they could have arbitrary side effects. So, either we are fine by definition, or if the compiler inlines the atomicLoad/atomicStore calls (which is actually possible in LDC), then its optimizer will detect the presence of inline assembly resp. the load/store intrinsics, and take care of not reordering the instructions in an invalid way. I don't see how this makes my answer to deadalnix (that »volatile« is not necessary to implement sequentially consistent loads/stores) any less valid. David
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 8:59 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
Sorry, I don't quite see where I simplified things.
First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.
 Yes, in the
 implementation of atomicLoad/atomicStore, one would probably use
 compiler intrinsics, as done in LDC's druntime, or inline assembly, as
 done for DMD.

 But an optimizer will never move instructions across opaque function
 calls, because they could have arbitrary side effects.
Nowhere in the language definition is it explained what an opaque function call is and what optimizations can and cannot be done in the presence of one.
 So, either we are
 fine by definition,
s/definition/happenstance/
 or if the compiler inlines the
 atomicLoad/atomicStore calls (which is actually possible in LDC), then
 its optimizer will detect the presence of inline assembly resp. the
 load/store intrinsics, and take care of not reordering the instructions
 in an invalid way.

 I don't see how this makes my answer to deadalnix (that »volatile« is
 not necessary to implement sequentially consistent loads/stores) any
 less valid.
Using load/store everywhere would make volatile unneeded (and for us, shared). But the advantage there is that you qualify the type/value once and then you don't need to remember to only use specific primitives to manipulate it. Andrei
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.
I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short-term thing.
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 12:04 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 9:50 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 First, there are more kinds of atomic loads and stores. Then, the fact that
the calls are not supposed to be reordered must be a guarantee of the language,
not a speculation about an implementation. We can't argue that a feature works
just because it so happens an implementation works a specific way.
I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short term thing.
I think we should focus on sequential consistency as that's where the industry is converging. Andrei
Nov 14 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.
Nov 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 This is a simplification of what should be going on. The
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the
compiler generates sequentially consistent code with them (i.e. does not perform
certain reorderings). Then there are loads and stores with weaker consistency
semantics (acquire, release, acquire/release, and consume).
No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.
Nov 15 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 4:54 AM, deadalnix <deadalnix gmail.com> wrote:

 Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).

 No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.

 It is sufficient for single-core and mostly correct for x86, but it isn't enough.

 volatile isn't for concurrency, but for memory mapping.
Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?
Nov 15 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 15/11/2012 17:33, Sean Kelly a écrit :
 On Nov 15, 2012, at 4:54 AM, deadalnix<deadalnix gmail.com>  wrote:

 Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>   wrote:
 This is a simplification of what should be going on. The
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the
compiler generates sequentially consistent code with them (i.e. does not perform
certain reorderings). Then there are loads and stores with weaker consistency
semantics (acquire, release, acquire/release, and consume).
No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.
Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?
I'm not aware of the D1 compiler inserting memory barriers, so any memory operation reordering done by the CPU would have screwed things up.
Nov 16 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 12:01 PM, Sean Kelly <sean invisibleduck.org> wrote:

 On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).

 No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.
asm blocks.  Darn auto-correct.
Nov 14 2012
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-13 23:22, Walter Bright wrote:

 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough? -- /Jacob Carlborg
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
Nov 14 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language? -- /Jacob Carlborg
Nov 14 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:31 AM, Jacob Carlborg wrote:
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
Not that I can think of.
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 11:38, Walter Bright wrote:

 Not that I can think of.
Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work. -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:47 AM, Jacob Carlborg wrote:
 On 2012-11-14 11:38, Walter Bright wrote:

 Not that I can think of.
Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work.
Actually this hypothesis is false. Andrei
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 15:33, Andrei Alexandrescu wrote:

 Actually this hypothesis is false.
That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true. -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:14 AM, Jacob Carlborg wrote:
 On 2012-11-14 15:33, Andrei Alexandrescu wrote:

 Actually this hypothesis is false.
That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true.
The hypothesis that atomic primitives can be implemented as a library. Andrei
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 18:36, Andrei Alexandrescu wrote:

 The hypothesis that atomic primitives can be implemented as a library.
I don't know these kind of things, that's why I'm asking. -- /Jacob Carlborg
Nov 14 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 10:31, Jacob Carlborg a écrit :
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
The compiler can do more reordering with regard to barriers. For instance, the compiler may reorder thread-local reads/writes across the barrier. This can't be done with a library solution.
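A sketch of what that means in practice, with invented names (tls_counter is thread-local by default in D):

import core.atomic : atomicStore;

shared int flag;
int tls_counter; // thread-local: no other thread can observe it

void publish()
{
    tls_counter++; // a compiler that understands the atomic store below
                   // may still move this access across it, because the
                   // variable is provably thread-local...
    atomicStore(flag, 1); // ...whereas an opaque library call would force
                          // it to keep the original order, just in case
}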
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 12:04, deadalnix wrote:

 The compiler can do more reordering with regard to barriers. For instance,
 the compiler may reorder thread-local reads/writes across the barrier.

 This can't be done with a library solution.
I see. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:31 AM, Jacob Carlborg wrote:
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
It's not an advantage, it's a necessity. Andrei
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 15:22, Andrei Alexandrescu wrote:

 It's not an advantage, it's a necessity.
Walter seems to indicate that there is no technical reason for "shared" to be part of the language. I don't know how these memory barriers work, that's why I'm asking. Does it need to be in the language or not? -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:16 AM, Jacob Carlborg wrote:
 On 2012-11-14 15:22, Andrei Alexandrescu wrote:

 It's not an advantage, it's a necessity.
Walter seems to indicate that there is no technical reason for "shared" to be part of the language.
Walter is a self-confessed dilettante in threading. To be frank I hope he asks more and answers less in this thread.
 I don't know how these memory barriers work,
 that's why I'm asking. Does it need to be in the language or not?
Memory ordering must be built into the language and understood by the compiler. Andrei
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 18:40, Andrei Alexandrescu wrote:

 Memory ordering must be built into the language and understood by the
 compiler.
Ok, thanks for the expatiation. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier. Andrei
Nov 14 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei 
Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is 
 there a
 reason for
 having it built into the language? Can a library solution be 
 enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«. David
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.
Compiler intrinsics ====== built into the language. Andrei
Nov 14 2012
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 14 November 2012 17:50, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it
 doesn't hoist code above an acquire barrier or below a release barrier.

 Again, this is true, but it would be a fallacy to conclude that
 compiler-inserted memory barriers for »shared« are required due to this
 (and it is »shared« we are discussing here!).

 Simply having compiler intrinsics for atomic loads/stores is enough,
 which is hardly »built into the language«.

 Compiler intrinsics ====== built into the language.

 Andrei

Not necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it.

Regards,
--
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 11:21 AM, Iain Buclaw wrote:
 On 14 November 2012 17:50, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.
Compiler intrinsics ====== built into the language. Andrei
Not necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it.
aware of what it is and what it does ====== built into the language. Andrei
Nov 14 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
Nov 14 2012
next sibling parent reply =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example. See also: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP20 -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.

 That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.

 The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.

Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block.  Or for a first cut just insert a full barrier at the beginning and end of the block.  Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use.

I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now.  But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files.  Druntime actually had this for core.atomic on PPC until not too long ago.
Nov 14 2012
parent =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
On 14-11-2012 21:15, Sean Kelly wrote:
 On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.
Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block. Or for a first cut just insert a full barrier at the beginning and end of the block. Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use. I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now. But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files. Druntime actually had this for core.atomic on PPC until not too long ago.
Well, there's not much point in that when all compilers have intrinsics anyway (e.g. GDC has __sync_* and __atomic_* and LDC has some intrinsics in ldc.intrinsics that map to certain LLVM instructions). -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
Because it's better to associate volatility with data than with code. Andrei
Nov 14 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 2:21 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.

 That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.

 Because it's better to associate volatility with data than with code.

Fair enough.  Though this may mean building a bunch of different forms of volatility into the language.  I always saw "volatile" as a library tool anyway, so while making it code-related was a bit weird, it was a sufficient tool for the job.
Nov 14 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :
 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
Because it's better to associate volatility with data than with code.
Happy to see I'm not alone on that one. Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient, for concurrency.
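A sketch of what "no register promotion" rules out, with invented names:

import core.atomic : atomicLoad;

shared bool stop;

void worker()
{
    // atomicLoad forces a fresh read on each iteration; a plain read of
    // `stop` could legally be promoted to a register and never re-checked,
    // turning this into an infinite loop.
    while (!atomicLoad(stop))
    {
        // do work
    }
}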
Nov 15 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 5:10 AM, deadalnix <deadalnix gmail.com> wrote:

 Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :
 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g.
 it doesn't hoist code above an acquire barrier or below a release
 barrier.

 That was the point of the now deprecated "volatile" statement. I still
 don't entirely understand why it was deprecated.

 Because it's better to associate volatility with data than with code.

 Happy to see I'm not alone on that one.

 Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient, for concurrency.

It's sufficient for concurrency when coupled with library code that does the hardware-level synchronization.  In short, a program has two separate machines doing similar optimizations on it: the compiler and the CPU.  In D we can use ASM to control CPU optimizations, and in D1 we had "volatile" to control compiler optimizations.  "volatile" was the minimum required for handling the compiler portion and was easy to get wrong, but it used only one keyword and I suspect was relatively easy to implement on the compiler side as well.
Nov 15 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 11:37 PM, Jacob Carlborg wrote:
 On 2012-11-13 23:22, Walter Bright wrote:

 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?
The compiler must be in this so as to not do certain reorderings. Andrei
Nov 14 2012
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 14:22:07 Walter Bright wrote:
 I'm just not convinced that having the compiler add memory barriers:
 
 1. will result in correctly working code, when done by programmers who have
 only an incomplete understanding of memory barriers, which would be about
 99.9% of us.
 
 2. will result in efficient code
Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers. If there's a major performance penalty though, that might be a reason not to do it. Certainly, I don't think that there's any question that adding memory barriers won't make it so that you don't need mutexes or synchronized blocks or whatnot. shared's primary benefit is in logically separating normal code from code that must share data across threads and making it possible for the compiler to optimize based on the fact that it knows that a variable is thread-local. - Jonathan M Davis
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and having
 memory barriers would reduce race condition weirdness when locks aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
Nov 14 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining. (And note that you can't optimize this either; since the dependencies memory barriers are supposed to express are subtle and not detectable by a compiler, the compiler would always have to insert them because it can't know when it would be safe not to.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 15:39, Alex Rønne Petersen a écrit :
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.
In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load. See http://g.oswego.edu/dl/jmm/cookbook.html for more information on the barriers required on different architectures.
 (And note that you can't optimize this either; since the dependencies
 memory barriers are supposed to express are subtle and not detectable by
 a compiler, the compiler would always have to insert them because it
 can't know when it would be safe not to.)
The compiler is aware of what is thread-local and what isn't. That means it can fully optimize TL stores and loads (e.g. doing register promotion or reordering them across shared stores/loads). This has a cost, indeed, but it is useful, and Walter's solution of casting away shared when a mutex is acquired is always available.
Nov 14 2012
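The store-followed-by-load case deadalnix mentions is the classic Dekker-style handshake. A minimal sketch with made-up flag names; only the seq variants imply the full barrier (mfence on x86) that rules the bad interleaving out:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared bool flagA, flagB;

// Thread 1 runs enterA; thread 2 runs the mirror image with the flags
// swapped. Without a full fence between the store and the load, x86's
// store buffer lets BOTH threads read false and both enter.
void enterA()
{
    atomicStore!(MemoryOrder.seq)(flagA, true); // store + full barrier
    if (!atomicLoad!(MemoryOrder.seq)(flagB))
    {
        // at most one thread can get here
    }
}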
parent Alex Rønne Petersen <alex lycus.org> writes:
On 14-11-2012 15:50, deadalnix wrote:
 On 14/11/2012 15:39, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.
In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load.
I just used x86's fencing instructions as an example because most people here are familiar with it. The problem is much, much bigger on architectures like ARM, MIPS, and PowerPC, which have much weaker memory ordering.
 See http://g.oswego.edu/dl/jmm/cookbook.html for more information on
 the barriers required on different architectures.

 (And note that you can't optimize this either; since the dependencies
 memory barriers are supposed to express are subtle and not detectable by
 a compiler, the compiler would always have to insert them because it
 can't know when it would be safe not to.)
The compiler is aware of what is thread-local and what isn't. That means it can fully optimize TL stores and loads (e.g. doing register promotion or reordering them across shared stores/loads).
Thread-local loads and stores are not atomic and thus do not take part in the reordering constraints that atomic operations impose. See e.g. the LLVM docs for atomicrmw and atomic load/store.
 This has a cost, indeed, but it is useful, and Walter's solution of
 casting away shared when a mutex is acquired is always available.
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments. Andrei
Nov 14 2012
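As a sketch of what (a) could look like: the names below are hypothetical, and the bodies are expressed on top of the existing core.atomic only so the example compiles -- a real intrinsic would be recognized by the compiler itself, which both emits the right instructions and refrains from moving other memory operations across the call:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

enum Ordering { acquire, release, seqCst }

T loadWith(Ordering o, T)(ref shared T location)
{
    static if (o == Ordering.acquire)
        return atomicLoad!(MemoryOrder.acq)(location);
    else static if (o == Ordering.seqCst)
        return atomicLoad!(MemoryOrder.seq)(location);
    else
        static assert(0, "release is not a load ordering");
}

void storeWith(Ordering o, T)(ref shared T location, T value)
{
    static if (o == Ordering.release)
        atomicStore!(MemoryOrder.rel)(location, value);
    else static if (o == Ordering.seqCst)
        atomicStore!(MemoryOrder.seq)(location, value);
    else
        static assert(0, "acquire is not a store ordering");
}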
next sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 14-11-2012 16:08, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
Let's continue this part of the discussion in my other reply (the one explaining how core.atomic is implemented in the various compilers).
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
I dunno. On low-end architectures like ARM, out-of-order processing is pretty much what makes them usable at all because they don't have the raw power that x86 does (I even recall an ARM Holdings executive saying that they couldn't possibly switch to a strong memory model with an in-order pipeline without severely reducing the efficiency of ARM). So I'm just putting that out there - it's definitely worth taking into consideration, because very few architectures are as strongly ordered as x86.
 Andrei
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei 
Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for 
 loading and storing data with high-level semantics (a short 
 list: acquire, release, acquire+release, and 
 sequentially-consistent) and THEN (b) implement the needed code 
 generation appropriately for each architecture. Indeed on x86 
 there is little need to insert fence instructions, BUT there is 
 a definite need for the compiler to prevent certain 
 reorderings. That's why implementing shared data operations 
 (whether implicit or explicit) as sheer library code is NOT 
 possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion). Thus, »we«, meaning on a language level, don't need to change anything about the current situation, with the possible exception of adding finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the duty of the compiler writers to provide the appropriate means to implement druntime on their code generation infrastructure – and indeed, the situation in DMD could be improved; using inline asm is hitting a fly with a sledgehammer. David [1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.
Nov 14 2012
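For concreteness, the publish/consume handshake these primitives support, using only what core.atomic already ships (ready and payload are made-up names):

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared bool ready;
shared int payload;

void producer()
{
    atomicStore!(MemoryOrder.raw)(payload, 42);
    // Release store: the payload write may not sink below this line.
    atomicStore!(MemoryOrder.rel)(ready, true);
}

void consumer()
{
    // Acquire load: pairs with the release store in producer.
    while (!atomicLoad!(MemoryOrder.acq)(ready)) {}
    assert(atomicLoad!(MemoryOrder.raw)(payload) == 42);
}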
next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 17:31:07 UTC, David Nadlinger 
wrote:
 Thus, »we«, meaning on a language level, don't need to change 
 anything about the current situations, […]
Let me clarify that: We don't necessarily need to tack on any extra semantics to the language other than what we currently have. However, what we must indeed do is clarify/specify the implicit consensus on which the current implementations are built. We really need a »The D Memory Model«-style document. David
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 9:31 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for loading
 and storing data with high-level semantics (a short list: acquire,
 release, acquire+release, and sequentially-consistent) and THEN (b)
 implement the needed code generation appropriately for each
 architecture. Indeed on x86 there is little need to insert fence
 instructions, BUT there is a definite need for the compiler to prevent
 certain reorderings. That's why implementing shared data operations
 (whether implicit or explicit) as sheer library code is NOT possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion).
Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.
 Thus, »we«, meaning on a language level, don't need to change anything
 about the current situation, with the possible exception of adding
 finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the
 duty of the compiler writers to provide the appropriate means to
 implement druntime on their code generation infrastructure – and indeed,
 the situation in DMD could be improved, using inline asm is hitting a
 fly with a sledgehammer.
That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.
 David


 [1] I am not sure where the point of diminishing returns is here,
 although it might make sense to provide the same options as C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies. Andrei
Nov 14 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 14 November 2012 19:54, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 9:31 AM, David Nadlinger wrote:

 On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for loading
 and storing data with high-level semantics (a short list: acquire,
 release, acquire+release, and sequentially-consistent) and THEN (b)
 implement the needed code generation appropriately for each
 architecture. Indeed on x86 there is little need to insert fence
 instructions, BUT there is a definite need for the compiler to prevent
 certain reorderings. That's why implementing shared data operations
 (whether implicit or explicit) as sheer library code is NOT possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion).
Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.
I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day. This is a very big deal. I would be scared to see the compiler generate intrinsic calls to atomic synchronisation primitives. It's almost like banning architectures from the language. The Nintendo Wii for instance, not an unpopular machine, only sold 130 million units! Does not have synchronisation instructions in the architecture (insane, I know, but there it is. I've had to spend time working around this in the past). I'm sure it's not unique in this way. People getting fancy with lock-free/atomic operations will probably wrap it up in libraries. And they're not globally applicable; atomic memory operations don't magically solve problems, they require very specific structures and access patterns around them. I'm just not convinced they should be intrinsics issued by the language. They're just not as well standardised as 'int' or 'float'. Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed. It's simplistic, but it's safe, and it works with the same primitives that already exist and are proven. Let the programmer mark the lock/unlock moments, worry about sequencing, etc... at least for the time being. Don't try and do it automatically (yet). The broad use cases in D aren't yet known, but making 'shared' useful today would be valuable.

 Thus, »we«, meaning on a language level, don't need to change anything
 about the current situation, with the possible exception of adding
 finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the
 duty of the compiler writers to provide the appropriate means to
 implement druntime on their code generation infrastructure – and indeed,
 the situation in DMD could be improved; using inline asm is hitting a
 fly with a sledgehammer.
That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.


  David
 [1] I am not sure where the point of diminishing returns is here,
 although it might make sense to provide the same options as C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies.


 Andrei
Nov 15 2012
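Manu's lockable-shared side note can be prototyped as a library wrapper today. A minimal sketch -- Locked is a hypothetical name, and the runtime assert stands in for any compiler support:

import core.sync.mutex : Mutex;

struct Locked(T)
{
    private shared(T)* payload;
    private Mutex mtx;
    private bool held;

    this(shared(T)* p) { payload = p; mtx = new Mutex; }

    void lock()   { mtx.lock();  held = true; }
    void unlock() { held = false; mtx.unlock(); }

    // Every access asserts, at runtime, that the lock is held, and only
    // then strips shared -- no user-visible cast required.
    ref T get()
    {
        assert(held, "accessing shared data without holding its lock");
        return *cast(T*) payload;
    }
}

Whether the held flag would itself have to be per-thread (to catch a different thread sneaking in between lock and access) is exactly the kind of detail such a design would need to settle.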
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 15/11/2012 10:08, Manu wrote:
 The Nintendo Wii for instance, not an unpopular machine, only sold 130
 million units! Does not have synchronisation instructions in the
 architecture (insane, I know, but there it is. I've had to spend time
 working around this in the past).
 I'm sure it's not unique in this way.
Can you elaborate on that?
Nov 15 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 1:08 AM, Manu wrote:
 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
     Yah, the whole point here is that we need something IN THE LANGUAGE
     DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.

     THIS IS VERY IMPORTANT.


 I won't outright disagree, but this seems VERY dangerous to me.

 You need to carefully study all popular architectures, and consider that
 if the language is made to depend on these primitives, and the
 architecture doesn't support it, or support that particular style of
 implementation (fairly likely), then D will become incompatible with a
 huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused. Andrei
Nov 15 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 7:17 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.

 This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.

By extension, I'd suggest postponing anything related to classes as well.
Nov 15 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 17:17, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the two, I'm suggesting we stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term; long-term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, i.e., don't try and weave in language-level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion: runtime-assert that a shared object is locked when it is read/written, and consequently lift the cast requirement, making it compatible with templates.
Nov 16 2012
parent reply "Pragma Tix" <bizprac orange.fr> writes:
On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:
 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical 
 solution is
 to make 'shared' things 'lockable'; where you can 
 lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), 
 but a
 runtime assert insists that the entity is locked whenever it 
 is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Nov 16 2012
parent reply Manu <turkeyman gmail.com> writes:
On 16 November 2012 12:09, Pragma Tix <bizprac orange.fr> wrote:

 On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:

 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

  On 11/15/12 1:08 AM, Manu wrote:
  On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip] Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Looks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all of these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared. That's a minimum that people will expect - it's a proven, well-known pattern that many are familiar with, and it can be done in the language right now. Sugaring a feature like that is simply about improving clarity and reducing friction for users of something that D likes to advertise as being a core feature of the language.
Nov 16 2012
parent "Pragma Tix" <bizprac orange.fr> writes:
On Friday, 16 November 2012 at 10:59:02 UTC, Manu wrote:
 On 16 November 2012 12:09, Pragma Tix <bizprac orange.fr> wrote:

 On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:

 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

  On 11/15/12 1:08 AM, Manu wrote:
  On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip] Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can 
 lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), 
 but a
 runtime assert insists that the entity is locked whenever 
 it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Looks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all of these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared. That's a minimum that people will expect - it's a proven, well-known pattern that many are familiar with, and it can be done in the language right now. Sugaring a feature like that is simply about improving clarity and reducing friction for users of something that D likes to advertise as being a core feature of the language.
Hi Manu, point taken. But Dmitry and Luka just made suggestions; Soenke offers something concrete (working right NOW). I am afraid that we'll end up in a situation similar to the std.collections opera: just bla bla, and zero results. (And the collection situation hasn't been solved since the very beginning of D, not to talk about immutable collections.) Probably not en vogue: for me, transactional memory management makes sense.
Nov 16 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 17:17, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it.
I can't resist... D may be serious about the *idea* of concurrency, but it clearly isn't serious about concurrency yet. shared is a prime example of that. We do support atomic primitives 'one way or another'; there are intrinsics on all compilers. Libraries can use them. Again, this thread seemed to be about urgent action... D needs a LOT of work on its concurrency model, but something of an urgent fix to make a key language feature more useful needs to leverage what's there now.
Nov 16 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
Alexandrescu wrote:
 That is correct. My point is that compiler implementers would 
 follow some specification. That specification would contain 
 information that atomicLoad and atomicStore must have special 
 properties that put them apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
 [1] I am not sure where the point of diminishing returns is 
 here,
 although it might make sense to provide the same options as 
 C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies.
I'm not quite sure what you are saying here. The functions in core.atomic already exist, and currently offer four levels (raw, acq, rel, seq). Are you suggesting to remove the other options? David
Nov 15 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store. Andrei
Nov 15 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei 
Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
 Alexandrescu wrote:
 That is correct. My point is that compiler implementers would 
 follow
 some specification. That specification would contain 
 information that
 atomicLoad and atomicStore must have special properties that 
 put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable. But still, you can't move memory operations across any other arbitrary function call either (unless you can prove it is safe by inspecting the callee's body, obviously), so I don't see where atomicLoad/atomicStore would be special here. David
Nov 15 2012
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.

 What are these special properties? Sorry, it seems like we are talking
 past each other…

 For example you can't hoist a memory operation before a shared load
 or after a shared store.

 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.

Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Nov 15 2012
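Sean's rule is the usual 'roach motel' ordering; annotated in a minimal sketch (m and the three variables are made-up names):

import core.sync.mutex : Mutex;

__gshared Mutex m;
__gshared int before, inside, after;

shared static this() { m = new Mutex; }

void f()
{
    before = 1;  // may legally sink INTO the locked region...
    m.lock();    // ...but nothing below may be hoisted above this acquire
    inside = 2;
    m.unlock();  // nothing above may be sunk below this release
    after = 3;   // may legally be hoisted INTO the locked region
}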
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger 
 <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory 
 operation you mean – moving non-volatile loads/stores across 
 volatile ones is typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«. David
Nov 15 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:30 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.

 Usually not, really. Like if you implement a mutex, you don't want
 non-volatile operations to be hoisted above the mutex acquire or sunk
 below the mutex release. However, it's safe to move additional
 operations into the block where the mutex is held.

 Oh well, I was just being stupid when typing up my response: What I
 meant to say is that you _can_ reorder a set of memory operations
 involving atomic/volatile ones unless you violate the guarantees of the
 chosen memory order option.

 So, for Andrei's statement to be true, shared needs to be defined as
 making all memory operations sequentially consistent. Walter doesn't
 seem to think this is the way to go, at least if that is what he is
 referring to as »memory barriers«.

I think because of the as-if rule, the compiler can continue to optimize all it wants between volatile operations. Just not across them.
Nov 15 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 3:30 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«.
Shared must be sequentially consistent. Andrei
Nov 15 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 15/11/2012 15:22, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger<see klickverbot.at>  wrote:

 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
If it is known that the memory read/write is thread local, this is safe, even in the case of a mutex.
Nov 18 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 3:05 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu
 wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.
In D that's fine (as long as in-thread SC is respected) because non-shared vars are guaranteed to be thread-local.
 But still, you can't move memory operations across any other arbitrary
 function call either (unless you can prove it is safe by inspecting the
 callee's body, obviously), so I don't see where atomicLoad/atomicStore
 would be special here.
It is special because e.g. on x86 the function is often a simple unprotected load or store. So after the inliner has at it, there's nothing to stand in the way of reordering. The point is the compiler must understand the semantics of acquire and release. Andrei
Nov 15 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.
Nov 14 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared" must
 prevent the compiler from reordering them. But that's separate from
 inserting memory barriers.
It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing. Andrei
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared" must
 prevent the compiler from reordering them. But that's separate from
 inserting memory barriers.
 It's the same issue at hand: ordering properly and inserting barriers
 are two ways to ensure one single goal, sequential consistency. Same
 thing.

Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:50 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 2:25 PM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared"
 must prevent the compiler from reordering them. But that's
 separate from inserting memory barriers.
It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.
Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
Yah, but the baseline here is acquire-release which has subtle differences that are all the more maddening. Andrei
Nov 14 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:28 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 4:50 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 2:25 PM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared"
 must prevent the compiler from reordering them. But that's
 separate from inserting memory barriers.

 It's the same issue at hand: ordering properly and inserting
 barriers are two ways to ensure one single goal, sequential
 consistency. Same thing.

 Sequential consistency is great and all, but it doesn't render
 concurrent code correct. At worst, it provides a false sense of
 security that somehow it does accomplish this, and people end up
 actually using it as such.

 Yah, but the baseline here is acquire-release which has subtle
 differences that are all the more maddening.

Really? Acquire-release always seemed to have equivalent safety to me. Typically, the user doesn't even have to understand that optimization can occur upwards across the trailing boundary of the block, etc, to produce correct code. Though I do agree that the industry is moving towards sequential consistency, so there may be no point in trying for something weaker.
Nov 15 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 22:09, Walter Bright wrote:
 On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.
I'm sorry but that is dumb. What is the point of ensuring that the compiler does not reorder load/stores if the CPU is allowed to do so?
Nov 15 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:

 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so?

Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> 
 wrote:
 
 What is the point of ensuring that the compiler does not 
 reorder load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway? David
Nov 15 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 2:18 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:
 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?
The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. Andrei
Nov 15 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 22:58:53 UTC, Andrei 
Alexandrescu wrote:
 On 11/15/12 2:18 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly 
 wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> 
 wrote:
 What is the point of ensuring that the compiler does not 
 reorder
 load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?
The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. Andrei
How does this have anything to do with deadalnix' question that I rephrased at all? It is not at all clear that shared should do this (it currently doesn't), and the question was explicitly about Walter's statement that shared should disable compiler reordering, when at the same time *not* inserting barriers/atomic ops. Thus the »which are not atomic« qualifier in my message. David
Nov 15 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 2:18 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:
 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so?

 Because we can write ASM to tell the CPU not to. We don't have any
 such ability for the compiler right now.

 I think the question was: Why would you want to disable compiler code
 motion for loads/stores which are not atomic, as the CPU might ruin your
 assumptions anyway?

A barrier isn't always necessary to achieve the desired ordering on a given system. But I'd still call out to ASM to make sure the intended operation happened. I don't know that I'd ever feel comfortable with "volatile x=y" even if what I'd do instead is just a MOV.
Nov 15 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 08:56, Jonathan M Davis wrote:

 Being able to have double-checked locking work would be valuable, and having
 memory barriers would reduce race condition weirdness when locks aren't used
 properly, so I think that it would be desirable to have memory barriers. If
 there's a major performance penalty though, that might be a reason not to do
 it. Certainly, there's no question that adding memory barriers won't
 eliminate the need for mutexes or synchronized blocks
 or whatnot. shared's primary benefit is in logically separating normal code
 from code that must share data across threads and making it possible for the
 compiler to optimize based on the fact that it knows that a variable is
 thread-local.
If there is a problem with efficiency in some cases, the developer can use __gshared and handle things manually. But of course, we don't want the developer to have to do this in most cases. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 13.11.2012 23:22, Walter Bright wrote:
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context. The same is done for classes by the GC, except that the GC ignores shared altogether. Kind Regards Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
 I still don't agree with you there. The struct would have clearly outlived any
 thread (as it was in the global scope) so at the point where it is destroyed
 there should be really only one thread left. So it IS destroyed in a single
 threaded context.
If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.
 The same is done for classes by the GC just that the GC
 ignores shared altogether.
That's different, because the GC verifies that there are *no* references to it from any thread first.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 10:18, Walter Bright wrote:
 On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
 I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope) so at the point where it is destroyed there should be really only one thread left. So it IS destroyed in a single threaded context.
If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.
 The same is done for classes by the GC just that the GC
 ignores shared altogether.
That's different, because the GC verifies that there are *no* references to it from any thread first.
Could you please give an example where it would break?

And what's the difference between:

struct Value
{
   ~this()
   {
     printf("destroy\n");
   }
}

shared Value v;

and:

shared static ~this()
{
   printf("destroy\n");
}

Kind Regards
Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
 And whats the difference between:

 struct Value
 {
    ~this()
    {
      printf("destroy\n");
    }
 }

 shared Value v;


 and:


 shared static ~this()
 {
    printf("destory\n");
 }
The struct declaration of ~this() has no idea what context it will be used in.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 11:42, Walter Bright wrote:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.

Kind Regards
Benjamin Thaut
Nov 14 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 2:49 AM, Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.
Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).
 And if the use of pointers is allowed, I can make the same case break in a
 single threaded environment without shared.
1. You can't escape pointers in safe code (well, it's a bug if you do).

2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 12:00, Walter Bright wrote:
 On 11/14/2012 2:49 AM, Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.
Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).
 And if the use of pointers is allowed, I can make the same case break
 in a
 single threaded environment without shared.
1. You can't escape pointers in safe code (well, it's a bug if you do).

2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
So just to be clear, escaping pointers in a single-threaded context is a bug. But if you escape them in a multithreaded context it's OK? That sounds inconsistent to me. But if that is by design, your argument is valid.

I still can not think of any real-world use case though where this could actually be used. A small code example which would break as soon as we allow destructing of shared value types would really be nice (maybe even in the language documentation, because I couldn't find anything).

Kind Regards
Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
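For concreteness, a minimal sketch of that scenario in code - an illustration only, since no code was given in the thread; the struct, the thread API usage and the use of destroy are assumptions:

import core.atomic : atomicOp;
import core.thread : Thread;

struct Value
{
    int x;
    ~this() { x = -1; } // stand-in for releasing a resource
}

void main()
{
    auto v = new shared(Value);      // Thread 1, step 1: create shared object

    auto t = new Thread({
        atomicOp!"+="(v.x, 1);       // Thread 2: manipulate the object,
    });                              // possibly after it has been destroyed
    t.start();                       // Thread 1, step 2: the reference escapes

    destroy(*cast(Value*) v);        // Thread 1, step 3: destroy it; the cast is
                                     // needed precisely because the compiler
    t.join();                        // rejects destroying a shared struct
}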
Nov 14 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:06 PM, Walter Bright wrote:
 On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
That should be disallowed at least in safe code. If I had my way I'd explore disallowing it in all code.

Andrei
Nov 14 2012
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 22:06, Walter Bright wrote:

 I hate to repeat myself, but:

 Thread 1:
      1. create shared object
      2. pass reference to that object to Thread 2
      3. destroy object

 Thread 2:
      1. manipulate that object
Why would the object be destroyed if there's still a reference to it? If the object is manually destroyed I don't see what threads have to do with it, since you can do the same thing in a single-threaded application.

-- 
/Jacob Carlborg
Nov 15 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 10:22:22 Jacob Carlborg wrote:
 On 2012-11-14 22:06, Walter Bright wrote:
 I hate to repeat myself, but:
 
 Thread 1:
      1. create shared object
      2. pass reference to that object to Thread 2
      3. destroy object
 
 Thread 2:
      1. manipulate that object
Why would the object be destroyed if there's still a reference to it? If the object is manually destroyed I don't see what threads have to do with it since you can do the same thing in a single thread application.
Yeah. If the reference passed across were shared, then the runtime should see it as having multiple references, and if it's _not_ shared, that means that you cast shared away (unsafe, since it's a cast) and passed it across threads without making sure that it was the only reference on the original thread. In that case, you shot yourself in the foot by using an @system construct (casting) and not getting it right. I don't see why the runtime would have to worry about that.

Unless the problem is that the object is a value type, so when it goes away on the first thread, it _has_ to be destroyed? If that's the case, then it's a pointer that was passed across rather than a reference, and then you've effectively done the same thing as returning a pointer to a local variable, which is @system and again only happens if you're getting @system wrong, which the compiler generally doesn't protect you from beyond giving you an error in the few cases where it can determine for certain that what you're doing is wrong (which is a fairly limited portion of the time).

So, as far as I can see - unless I'm just totally missing something here - either you're dealing with shared objects on the heap here, in which case, the object shouldn't be destroyed on the first thread unless you do it manually (in which case, you're doing something stupid in @system code), or you're dealing with passing pointers to shared value types across threads, which is essentially the equivalent of escaping a pointer to a local variable (in which case, you're doing something stupid in @system code). In either case, you're doing something stupid in @system code, and I don't see why the runtime would have to worry about it. You shot yourself in the foot by incorrectly using @system code. If you want protection against that, then don't use @system code.

- Jonathan M Davis
Nov 15 2012
parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 15.11.2012 12:48, Jonathan M Davis wrote:
 Yeah. If the reference passed across were shared, then the runtime should see
 it as having multiple references, and if it's _not_ shared, that means that
 you cast shared away (unsafe, since it's a cast) and passed it across threads
 without making sure that it was the only reference on the original thread. In
 that case, you shot yourself in the foot by using an @system construct
 (casting) and not getting it right. I don't see why the runtime would have to
 worry about that.

 Unless the problem is that the object is a value type, so when it goes away on
 the first thread, it _has_ to be destroyed? If that's the case, then it's a
 pointer that was passed across rather than a reference, and then you've
 effectively done the same thing as returning a pointer to a local variable,
 which is @system and again only happens if you're getting @system wrong, which
 the compiler generally doesn't protect you from beyond giving you an error in
 the few cases where it can determine for certain that what you're doing is
 wrong (which is a fairly limited portion of the time).

 So, as far as I can see - unless I'm just totally missing something here -
 either you're dealing with shared objects on the heap here, in which case, the
 object shouldn't be destroyed on the first thread unless you do it manually (in
 which case, you're doing something stupid in @system code), or you're dealing
 with passing pointers to shared value types across threads, which is
 essentially the equivalent of escaping a pointer to a local variable (in which
 case, you're doing something stupid in @system code). In either case,
 you're doing something stupid in @system code, and I don't see why the runtime
 would have to worry about it. You shot yourself in the foot by incorrectly
 using @system code. If you want protection against that, then don't use @system
 code.

 - Jonathan M Davis
Thank you, that's exactly how I'm thinking too. And because of this it makes absolutely no sense to me to disallow the destruction of a shared struct if it is allocated on the stack or as a global. If it is allocated on the heap you can't destroy it manually anyway, because delete is deprecated.

And for exactly this reason I wanted a code example from Walter. Because just listing a few bullet points does not make a real-world use case.

Kind Regards
Benjamin Thaut
Nov 15 2012
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/15/2012 1:06 AM, Walter Bright wrote:
 On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
Ain't structs typically copied anyway? Reference would imply pointer then. If the struct is on the stack (weird but could be) then the thread that created it destroys the object once. The thing is as unsafe as escaping a pointer is. Personally I think that shared stuff allocated on the stack is here-be-dragons @system code in any case.

Otherwise it's the GC's responsibility to destroy a heap-allocated struct when there are no references to it. What's so puzzling about it?

BTW currently GC-allocated structs are not having their destructors called at all. The bug is however _minor_ ...
http://d.puremagic.com/issues/show_bug.cgi?id=2834

-- 
Dmitry Olshansky
Nov 15 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 11:49:22 Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.
Pointers are not considered unsafe at all and are perfectly legal in SafeD. It's pointer _arithmetic_ which is unsafe and therefore considered to be @system.

- Jonathan M Davis
Nov 14 2012
prev sibling next sibling parent "Jason House" <jason.james.house gmail.com> writes:
On Monday, 12 November 2012 at 02:31:05 UTC, Walter Bright wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is a fairly reasonable use of shared, but it is bypassing the type system. Once shared is cast away, it is free to be mixed with thread-local variables. Pieces can be assigned to non-shared globals, impure functions can stash references, weakly pure functions can mix their arguments together, etc...

If locking converts shared(T) to bikeshed(T), I bet some of SafeD's logic for no escaping references could be used to improve things.

It's also interesting to note that casting away shared after taking a lock implicitly means that everything was transitively owned by that lock. I wonder how well a library could promote/enforce such a thing?
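A minimal sketch of those five steps with hypothetical names (Account, accountLock, deposit); note that nothing stops the unshared pointer from escaping the locked region, which is exactly the hole described above:

import core.sync.mutex : Mutex;

struct Account { int balance; }

shared Account account;      // the shared data
__gshared Mutex accountLock; // protects `account` by convention only

shared static this() { accountLock = new Mutex; }

void deposit(int amount)
{
    accountLock.lock();                   // 1. ensure single-threaded access
    scope (exit) accountLock.unlock();    // 5. release the mutex on scope exit

    auto local = cast(Account*) &account; // 2. cast away shared
    local.balance += amount;              // 3. operate on the data
}                                         // 4. the unshared view simply dies here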
Nov 14 2012
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/12 6:30 PM, Walter Bright wrote:
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote it in TDPL). I can't believe I need to restart this on a cold cache.

Andrei
Nov 14 2012
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 18:30:56 Andrei Alexandrescu wrote:
 On 11/11/12 6:30 PM, Walter Bright wrote:
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.
Well, this is clearly how things work now, and if you want to use shared with much of anything, it's how things generally have to work, because almost nothing takes shared. Templated stuff will at least some of the time (though it's often untested for it and probably will get screwed by Unqual in quite a few cases), but there's no way aside from templates or casting to get shared variables to share the same functions as non-shared ones, leading to code duplication.
From what I recall of what TDPL says, this doesn't really contradict it. It's just that TDPL doesn't really say much about the fact that almost nothing will work with shared, which means that casting is necessary.

I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.

- Jonathan M Davis
Nov 14 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:

 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required.
One thing I'm confused about right now is how people are using shared. If you're using shared with atomic operations, then you need barriers when accessing or mutating the variable. If you're using shared with mutexes, spin-locks, etc., you don't care about the barriers. But you can't use it with both at the same time. So which of these does shared stand for?

In both of these cases, there's an implicit policy for accessing or mutating the variable. I think the language needs some way to express that policy. I suggested some time ago a way to protect variables with mutexes so that the compiler can actually help you use those mutexes correctly[1]. The idea was to associate a mutex to the variable declaration. This could be extended to support an atomic access policy.

Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex:

	synchronized int i; // declaration

	i++; // error, variable shared

	synchronized (i)
		i++; // fine, variable is thread-local inside synchronized block

Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds.

Now, if you want a custom mutex class, write it like this:

	synchronized(SpinLock) int i;

	synchronized(i)
	{
		// implicit: i.mutexof.lock();
		// implicit: scope (exit) i.mutexof.unlock();
		i++;
	}

If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration:

	Mutex m;
	synchronized(m) int i;

	synchronized(i)
	{
		// implicit: m.lock();
		// implicit: scope (exit) m.unlock();
		i++;
	}

Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const:

	synchronized(RWMutex) int i;

	synchronized(const i)
	{
		// implicit: i.mutexof.constLock();
		// implicit: scope (exit) i.mutexof.constUnlock();
		i++; // error, i is const
	}

And finally, if you want to use atomic operations, declare it this way:

	synchronized(Atomic) int i;

You can't really synchronize on something protected by Atomic:

	synchronized(i) // cannot make synchronized block, no lock/unlock method in Atomic
	{}

But you can call operators on it while synchronized, it works for anything implemented by Atomic:

	synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.

Okay, this syntax probably still has some problems, feel free to point them out. I don't really care about the syntax though. The important thing is that you need a way to define the policy for accessing the shared data in a way the compiler can actually enforce, and that programmers can actually reuse. Because right now there is no policy. Having to cast things everywhere is equivalent to having to redefine the policy everywhere. Same for having to write encapsulation types that work with shared for everything you want to share: each type has to implement the policy.
There's nothing worse than constantly rewriting the sharing policies. Concurrency is error-prone because of all the subtleties; you don't want to encourage people to write policies of their own every time they invent a new type. You need to reuse existing ones, and the compiler can help with that.

[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 15 Nov 2012 04:33:20 -0000, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:

 I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required.

 Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex:

 	synchronized int i; // declaration

 	i++; // error, variable shared

 	synchronized (i)
 		i++; // fine, variable is thread-local inside synchronized block

 Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds.

 Now, if you want a custom mutex class, write it like this:

 	synchronized(SpinLock) int i;

 	synchronized(i)
 	{
 		// implicit: i.mutexof.lock();
 		// implicit: scope (exit) i.mutexof.unlock();
 		i++;
 	}

 If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration:

 	Mutex m;
 	synchronized(m) int i;

 	synchronized(i)
 	{
 		// implicit: m.lock();
 		// implicit: scope (exit) m.unlock();
 		i++;
 	}

 Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const:

 	synchronized(RWMutex) int i;

 	synchronized(const i)
 	{
 		// implicit: i.mutexof.constLock();
 		// implicit: scope (exit) i.mutexof.constUnlock();
 		i++; // error, i is const
 	}

 And finally, if you want to use atomic operations, declare it this way:

 	synchronized(Atomic) int i;

 You can't really synchronize on something protected by Atomic:

 	synchronized(i) // cannot make synchronized block, no lock/unlock method in Atomic
 	{}

 But you can call operators on it while synchronized, it works for anything implemented by Atomic:

 	synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

 Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.

+1

I suggested something similar as did Sönke:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk

According to deadalnix the compiler magic I suggested to add the mutex isn't possible:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com

Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc).

So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is..

1. Remove the existing mutex from object.
2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface
3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*)
4. Design a Shared(*) base class which contains a mutex and implements synchable(*)

Then we design classes which are always shared using the base class and we wrap other objects we want to share in Shared() and use them in synchronized statements.

This would then relegate any builtin "shared" statement to be solely a storage class which makes the object global and not thread local.

(*) names up for debate

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 15 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:16 AM, Regan Heath <regan netmail.co.nz> wrote:

 I suggested something similar as did Sönke:
 http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk

 According to deadalnix the compiler magic I suggested to add the mutex isn't possible:
 http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com

 Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc).

If I understand you correctly, you don't need anything that explicitly contains the sync object. A global table of mutexes used according to the address of the value to be mutated should work.

 So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is..

 1. Remove the existing mutex from object.
 2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface
 3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*)
 4. Design a Shared(*) base class which contains a mutex and implements synchable(*)

It would be nice to eliminate the mutex that's optionally built into classes now. The possibility of having to allocate a new mutex on whatever random function call happens to be the first one with "synchronized" is kinda not great.
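A rough sketch of that global-table idea, with invented names (mutexFor, withLock) and no attempt at lock striping or cleanup; it only illustrates the shape of the approach:

import core.sync.mutex : Mutex;

__gshared Mutex[void*] mutexTable; // address -> mutex guarding that address
__gshared Mutex tableLock;         // guards the table itself

shared static this() { tableLock = new Mutex; }

Mutex mutexFor(shared(void)* addr)
{
    auto key = cast(void*) addr;
    tableLock.lock();
    scope (exit) tableLock.unlock();
    if (auto m = key in mutexTable)
        return *m;
    return mutexTable[key] = new Mutex;
}

void withLock(T)(ref shared T value, scope void delegate(ref T) dg)
{
    auto m = mutexFor(cast(shared(void)*) &value);
    m.lock();
    scope (exit) m.unlock();
    dg(*cast(T*) &value); // hand out an unshared view while the lock is held
}

Usage would be along the lines of `shared int hits; withLock(hits, (ref int h) { ++h; });`.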
Nov 15 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/15/2012 8:33 AM, Michel Fortin wrote:

 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:

      Mutex m;
      synchronized(m) int i;

      synchronized(i)
      {
          // implicit: m.lock();
          // implicit: scope (exit) m.unlock();
          i++;
      }
While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot. I'd say: "Need direct access to the mutex? - Go on with the manual way, it's still right there (and scope(exit) for that matter)". Another problem is that somebody clever can escape a reference to the unlocked 'i' inside of synchronized to somewhere else.

But anyway we can make it in the library right about now.

synchronized T ---> Synchronized!T

synchronized(i){ ... } --->

i.access((x){
	//will lock & cast away shared T inside of it
	...
});

I fail to see what it doesn't solve (aside of syntactic sugar).

The key point is that Synchronized!T is otherwise an opaque type. We could pack a few other simple primitives like 'load', 'store' etc. All of them will go through lock-unlock. Even escaping a reference can be solved by passing inside of 'access' a proxy of T. It could even assert that the lock is indeed locked.

Same goes about Atomic!T. Though the set of primitives is quite limited depending on T. (I thought that built-in shared(T) is already atomic though so no need to reinvent this wheel.)

It's time we finally agree that the 'shared' qualifier is an assembly language of multi-threading based on sharing. It just needs some safe patterns in the library. That and clarifying explicitly what guarantees (aside from being well.. being shared) it provides w.r.t. the memory model.

Until reaching this thread I was under the impression that shared means:
- globally visible
- atomic operations for stuff that fits in one word
- sequentially consistent guarantee
- any other forms of access are disallowed except via casts

-- 
Dmitry Olshansky
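A bare-bones sketch of what such a Synchronized!T might look like (illustrative only; the gist linked later in the thread is more complete):

import core.sync.mutex : Mutex;

struct Synchronized(T)
{
    private shared T payload;
    private Mutex mtx;

    this(T initial)
    {
        payload = cast(shared) initial;
        mtx = new Mutex; // must be constructed; default init would leave it null
    }

    // Locks, hands the delegate an unshared view, unlocks on scope exit.
    void access(scope void delegate(ref T) dg)
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        dg(*cast(T*) &payload);
    }
}

unittest
{
    auto counter = Synchronized!int(0);
    counter.access((ref int x) { ++x; });
}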
Nov 15 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/15/2012 8:33 AM, Michel Fortin wrote:
 
 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:
 
      Mutex m;
      synchronized(m) int i;
 
      synchronized(i)
      {
          // implicit: m.lock();
          // implicit: scope (exit) m.unlock();
          i++;
      }
While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex. For instance:

	Mutex m;
	synchronized(m) int next_id;
	synchronized(m) Object[int] objects_by_id;

	int addObject(Object o)
	{
		synchronized(next_id, objects_by_id)
			return objects_by_id[next_id++] = o;
	}

Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id.

I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.
 But anyway we can make it in the library right about now.
 
 synchronized T ---> Synchronized!T
 synchronized(i){ ... } --->
 
 i.access((x){
 //will lock & cast away shared T inside of it
 	...
 });
 
 I fail to see what it doesn't solve (aside of syntactic sugar).
It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T:

	Synchronized!(Tuple!(int, Object[int])) objects_by_id;

	int addObject(Object o)
	{
		int id;
		objects_by_id.access((obj_by_id){
			id = obj_by_id[1][obj_by_id[0]++] = o;
		});
		return id;
	}

I'm not sure if I have to explain why I prefer the first one or not, to me it's pretty obvious.
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc. 
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
 Even escaping a reference can be solved by passing inside of 'access'
 a proxy of T. It could even asserts that the lock is in indeed locked.
Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container.

This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later.

What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and things alike impossible without casts at every corner, which makes things even more bug prone than being able to escape references in the first place.

There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the uttermost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.
 Same goes about Atomic!T. Though the set of primitives is quite limited 
 depending on T.
 (I thought that built-in shared(T) is already atomic though so no need 
 to reinvent this wheel)
 
 It's time we finally agree that 'shared' qualifier is an assembly 
 language of multi-threading based on sharing. It just needs some safe 
 patterns in the library.
 
 That and clarifying explicitly what guarantees (aside from being well.. 
 being shared) it provides w.r.t. memory model.
 
 Until reaching this thread I was under impression that shared means:
 - globally visible
 - atomic operations for stuff that fits in one word
 - sequentially consistent guarantee
 - any other forms of access are disallowed except via casts
Built-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what the conclusion will be, but the way I see things atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads.

I'm not thrilled by the idea of making everything atomic by default. That'll lure users onto the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) to second-class citizens. I think it's better that you just can't do anything with shared, or that shared simply disappear, and that those variables that must be shared be accessible only through some kind of access policy. Atomic access should be one of those access policies, on an equal footing with other ones.

But if D2 is still "frozen" -- as it was meant to be when TDPL got out -- and only minor changes can be made to it now, I don't see much hope for its concurrency model. Your Synchronized!T and Atomic!T wrappers might be the best thing we can hope for, but they're nothing to set D apart from its rivals (I could implement that easily in C++ for instance).

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 16 2012
next sibling parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
On 16.11.2012 14:17, Michel Fortin wrote:
 
 Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container.

 This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later.

 What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and things alike impossible without casts at every corner, which makes things even more bug prone than being able to escape references in the first place.

 There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the uttermost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.
 
Can you have a look at my thread about this?
http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com

I would of course favor a nicely integrated language solution that is able to lift as many restrictions as possible, while still keeping everything statically verified [I would also like to have a language solution to Rebindable!T ;)]. But as an alternative to a years-lasting discussion which does not lead to any agreed-upon solution, I'd much rather have such a library solution - it can do a lot, is reasonably pretty, and is (supposedly and with a small exception) fully safe.
Nov 16 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/15/2012 8:33 AM, Michel Fortin wrote:
 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:
     Mutex m;
     synchronized(m) int i;
     synchronized(i)
     {
         // implicit: m.lock();
         // implicit: scope (exit) m.unlock();
         i++;
     }
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.

 In case you want to protect two variables (or more) with the same mutex. For instance:

 	Mutex m;
 	synchronized(m) int next_id;
 	synchronized(m) Object[int] objects_by_id;

 	int addObject(Object o)
 	{
 		synchronized(next_id, objects_by_id)
 			return objects_by_id[next_id++] = o;
 	}

 Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id.

 I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.

This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.
Nov 16 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-16 15:23:37 +0000, Sean Kelly <sean invisibleduck.org> said:

 On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:
 
 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex.
This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.
Perhaps it's just my style of coding, but when designing a class that needs to be shared in C++, I usually use one mutex to protect only a couple of variables inside the object. That might mean I have two mutexes in one class for two sets of variables if it fits the access pattern. I also make the mutex private so that derived classes cannot access it. The idea is to strictly control what happens when each mutex is locked so that I can make sure I never have two mutexes locked at the same time without looking at the whole code base. This is to avoid deadlocks, and also it removes the need for recursive mutexes.

I'd like the language to help me enforce this pattern, and what I'm proposing goes in that direction.

Regarding setSameMutex, I'd argue that the semantics of having one mutex for a whole object isn't great. Mutexes shouldn't protect types, they should protect variables. Whether a class needs to protect its variables and how it does it is an implementation detail that shouldn't be leaked to the outside world. What the outside world should know is whether the object is thread-safe or not.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 17 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/16/2012 5:17 PM, Michel Fortin wrote:
 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com>
 said:
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex. For instance:

	Mutex m;
	synchronized(m) int next_id;
	synchronized(m) Object[int] objects_by_id;
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
      int addObject(Object o)
      {
          synchronized(next_id, objects_by_id)
...synchronized(objRepo) with(objRepo)... Though I'd rather use it as a struct directly.
              return objects_by_id[next_id++] = o;
      }

 Here it doesn't make sense and is less efficient to have two mutexes,
 since every time you need to lock on next_id you'll also want to lock on
 objects_by_id.
Yes. But we shouldn't close our eyes to the rest of the language when deciding how to implement this. Moreover it makes more sense to pack related stuff (that is under a single lock) into a separate entity.
 I'm not sure how you could shoot yourself in the foot with this. You
 might get worse performance if you reuse the same mutex for too many
 things, just like you might get better performance if you use it wisely.
Easily - now the mutex is separate and there is no guarantee that it won't get used for something other than intended. The declaration implies the connection but I do not see anything preventing it from abuse.
 But anyway we can make it in the library right about now.

 synchronized T ---> Synchronized!T
 synchronized(i){ ... } --->

 i.access((x){
 //will lock & cast away shared T inside of it
     ...
 });

 I fail to see what it doesn't solve (aside of syntactic sugar).
 It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T:

 	Synchronized!(Tuple!(int, Object[int])) objects_by_id;

 	int addObject(Object o)
 	{
 		int id;
 		objects_by_id.access((obj_by_id){
 			id = obj_by_id[1][obj_by_id[0]++] = o;
 		});
 		return id;
 	}

 I'm not sure if I have to explain why I prefer the first one or not, to me it's pretty obvious.
If we made a tiny change in the language that would allow different syntax for passing delegates, mine would shine. Such a change at the same time enables a nicer way to abstract away control flow.

Imagine:

access(object_by_id){
	...
};

to be convertible to:

(x){with(x){
	...
}}(access(object_by_id));

More generally speaking, a lowering:

expression { ... }
-->
(x){with(x){ ... }}(expression);

AFAIK it doesn't conflict with anything.

Or wait a sec. Even simpler idiom and no extra features. Drop the idea of 'access' taking a delegate. The other library idiom is to return a RAII proxy that locks/unlocks an object on construction/destroy.

with(lock(object_by_id))
{
	... do what you like
}

Fine by me. And C++ can't do it ;)
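A sketch of that RAII proxy idiom, with hypothetical names (Locked, lock); it locks on construction, unlocks on destruction, and hands out an unshared view in between:

import core.sync.mutex : Mutex;

struct Locked(T)
{
    private T* data;
    private Mutex mtx;

    @disable this(this); // non-copyable: exactly one owner of the lock

    this(shared(T)* p, Mutex m)
    {
        data = cast(T*) p; // strip shared only while the mutex is held
        mtx = m;
        mtx.lock();
    }

    ~this() { mtx.unlock(); }

    ref T get() { return *data; }
}

Locked!T lock(T)(ref shared T value, Mutex m)
{
    return Locked!T(&value, m);
}

Usage would be along the lines of `{ auto r = lock(objRepo, repoMutex); r.get.objects_by_id[r.get.next_id++] = o; }` - the mutex is released when r goes out of scope.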
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc.
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Your works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
I kind of wanted to point out this disturbing thought about your proposal: a lot of extra syntax and rules added buys us a very small gain - prettier syntax.
 Even escaping a reference can be solved by passing inside of 'access'
 a proxy of T. It could even asserts that the lock is in indeed locked.
Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its part, like if you slice a container. This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier patch the hole later.
It need not be 100% malicious-dumbass-proof. Basic foolproofness is OK.

See my sketch, it could be vastly improved:
https://gist.github.com/4089706

See also Ludwig's work. Though he is focused on classes and their monitor mutex.
 What bothers me currently is that because we want to patch all the holes
 while not having all the necessary tools in the language to avoid
 escaping references, we just make using mutexes and things alike
 impossible without casts at every corner, which makes things even more
 bug prone than being able to escape references in the first place.
Well, it's kind of double-edged. However I do think we need more general tools in the language and niche ones in the library. Precisely because you can pack tons of niche and miscellaneous stuff on the bookshelf ;) Locks & the works are niche stuff enabling a lot more of common things.
 There are many perils in concurrency, and the compiler cannot protect
 you from them all. It is of the uttermost importance that code dealing
 with mutexes be both readable and clear about what it is doing. Casts in
 this context are an obfuscator.
See below about high-level primitives. The code dealing with mutexes has to be small and isolated anyway. Encouraging the pattern of 'just grab the lock and you are golden' is even worse (because it won't break as fast and hard as e.g. naive atomics will).
 That and clarifying explicitly what guarantees (aside from being
 well.. being shared) it provides w.r.t. memory model.

 Until reaching this thread I was under impression that shared means:
 - globally visible
 - atomic operations for stuff that fits in one word
 - sequentially consistent guarantee
 - any other forms of access are disallowed except via casts
Built-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what will be the conclusion, but the way I see things atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads. I'm not trilled by the idea of making everything atomic by default. That'll lure users to the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) as a second-class citizen.
That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and what not. Even Java has some useful incarnations of these.
 I think it's better that you just can't do
 anything with shared, or that shared simply disappear, and that those
 variables that must be shared be accessible only through some kind of
 access policy. Atomic access should be one of those access policies, on
 an equal footing with other ones.
This is where casts will be a most unwelcome obfuscator and there is no sensible way to de-obscure it by using higher level primitives. Having to say Atomic!X is workable though.
 But if D2 is still "frozen" -- as it was meant to be when TDPL got out
 -- and only minor changes can be made to it now, I don't see much hope
 for its concurrency model. Your Syncronized!T and Atomic!T wrappers
 might be the best thing we can hope for, but they're nothing to set D
 apart from its rivals (I could implement that easily in C++ for instance).
Yeah, but we may tweak some syntax in terms of one lowering or a couple.

I'm of the strong opinion that lock-based multi-threading needs no _specific_ built-in support in the language. The case is niche and hardly useful outside of certain help with doing safe high-level primitives in the library. As for client code, it doesn't care that much.

Compared to C++ there is one big thing: no-shared by default. This alone should be immensely helpful, especially when dealing with 3rd party libraries that 'try hard to be thread-safe' except that they are usually not.

-- 
Dmitry Olshansky
Nov 16 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/16/2012 5:17 PM, Michel Fortin wrote:
 In case you want to protect two variables (or more) with the same mutex.
 For instance:
 
      Mutex m;
      synchronized(m) int next_id;
      synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
 If we made a tiny change in the language that would allow different 
 syntax for passing delegates mine would shine. Such a change at the 
 same time enables more nice way to abstract away control flow.
 
 Imagine:
 
 access(object_by_id){
 	...	
 };
 
 to be convertible to:
 
 (x){with(x){
 	...
 }}(access(object_by_id));
 
 More generally speaking a lowering:
 
 expression { ... }
 -->
 (x){with(x){ ... }}(expression);
 
 AFAIK it doesn't conflict with anything.
 
 Or wait a sec. Even simpler idiom and no extra features.
 Drop the idea of 'access' taking a delegate. The other library idiom is 
 to return a RAII proxy that locks/unlocks an object on 
 construction/destroy.
 
 with(lock(object_by_id))
 {
 	... do what you like
 }
 
 Fine by me. And C++ can't do it ;)
Clever. But you forgot to access the variable somewhere. What's its name within the with block? Your code would be clearer this way:

      {
          auto locked_object_by_id = lock(object_by_id);
          // … do what you like
      }

And yes you can definitely do that in C++.

I maintain that the "synchronized (var)" syntax is still much clearer, and greppable too. That could be achieved with an appropriate lowering.
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc.
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
I kind of wanted to point out this disturbing thought about your proposal: a lot of extra syntax and rules added buys us a very small gain - prettier syntax.
Sometimes having something built into the language is important: it gives first-class status to some constructs. For instance: arrays. We don't need language-level arrays in D, we could just use a struct template that does the same thing. By integrating a feature into the language we're sending the message that this is *the* way to do it, as no other way can stand on equal footing, preventing infinite reimplementation of the concept within various libraries.

You might be right however that mutex-protected variables do not deserve this first-class status.
 Built-in shared(T) atomicity (sequential consistency) is a subject of
 debate in this thread. It is not clear to me what will be the
 conclusion, but the way I see things atomicity is just one of the many
 policies you may want to use for keeping consistency when sharing data
 between threads.
 
 I'm not thrilled by the idea of making everything atomic by default.
 That'll lure users to the bug-prone expert-only path while relegating
 the more generally applicable protection systems (mutexes) as a
 second-class citizen.
That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and what not. Even Java has some useful incarnations of these.
I wouldn't say they shouldn't use mutexes at all, but perhaps you're right that they don't deserve first-class treatment. I still maintain that "synchronized (var)" should work, for clarity and consistency reasons, but using a template such as Synchronized!T when declaring the variable might be the best solution.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 17 2012
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-17 14:22, Michel Fortin wrote:

 Sometimes having something built into the language is important: it gives
 first-class status to some constructs. For instance: arrays. We don't
 need language-level arrays in D, we could just use a struct template
 that does the same thing. By integrating a feature into the language
 we're sending the message that this is *the* way to do it, as no other
 way can stand on equal footing, preventing infinite reimplementation of
 the concept within various libraries.
If a feature can be implemented in a library with the same syntax, semantics and performance, I see no reason to put it in the language.

-- 
/Jacob Carlborg
Nov 17 2012
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/17/2012 5:22 PM, Michel Fortin wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 Or wait a sec. Even simpler idiom and no extra features.
 Drop the idea of 'access' taking a delegate. The other library idiom
 is to return a RAII proxy that locks/unlocks an object on
 construction/destroy.

 with(lock(object_by_id))
 {
     ... do what you like
 }

 Fine by me. And C++ can't do it ;)
Clever. But you forgot to access the variable somewhere. What's its name within the with block?
Not having the name would imply you can't escape it :) But I agree it's not always clear where the writes go to when doing things inside the with block.
Your code would be clearer this way:

      {
          auto locked_object_by_id = lock(object_by_id);
          // … do what you like
      }

 And yes you can definitely do that in C++.
Well, I actually did it in the past when C++0x was relatively new. I just thought 'with' makes it more interesting. As to how to access the variable - it depends on what it is.
 I maintain that the "synchronized (var)" syntax is still much clearer,
 and greppable too. That could be achieved with an appropriate lowering.
Yes! If we could make synchronized user-hookable, this all would be more clear and generally useful. There was a discussion about providing user-defined semantics for the synchronized block. It was clear and useful and a lot of folks were in favor of it. Yet it wasn't submitted as a proposal.

All other things being equal, I believe we should go in this direction - amend a couple of things (say, add a user-hookable synchronized) and start laying bricks for std.sharing.

-- 
Dmitry Olshansky
Nov 17 2012
prev sibling parent reply "foobar" <foo bar.com> writes:
On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin 
wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky 
 <dmitry.olsh gmail.com> said:

 11/16/2012 5:17 PM, Michel Fortin пишет:
 In case you want to protect two variables (or more) with the 
 same mutex.
 For instance:
 
     Mutex m;
     synchronized(m) int next_id;
     synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
<snip> That solution does not work in the general case. More specifically any graph-like data structure. E.g a linked-lists, trees, etc.. Think for example an insert to a shared AVL tree.
Nov 19 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-19 09:31:46 +0000, "foobar" <foo bar.com> said:

 On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 
 11/16/2012 5:17 PM, Michel Fortin пишет:
 In case you want to protect two variables (or more) with the same mutex.
 For instance:
 
     Mutex m;
     synchronized(m) int next_id;
     synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
<snip> That solution does not work in the general case. More specifically any graph-like data structure. E.g a linked-lists, trees, etc.. Think for example an insert to a shared AVL tree.
No solution will be foolproof in the general case unless we add new type modifiers to the language to prevent escaping references, something Walter is reluctant to do. So whatever we do with mutexes it'll always be a leaky abstraction. I'm not too thrilled by this either. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 19 2012
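The escape problem can be shown in a few lines (all types here are made up for illustration): the mutex protects the structure only by convention, and nothing stops an interior reference from leaking out of the locked region.

---
import core.sync.mutex : Mutex;

class Node { int value; Node next; }

class List
{
    Node head;
    Mutex mtx; // by convention, guards head and everything reachable from it
}

void leaky(List list)
{
    Node leaked;
    synchronized (list.mtx)
    {
        leaked = list.head; // perfectly legal: just a reference copy
    }
    leaked.value = 42;      // mutated after the lock is released: data race
}
---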
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 From what I recall of what TDPL says
It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.
Nov 14 2012
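For reference, what current compilers actually provide is the explicit route via core.atomic rather than the automatic guarantees TDPL describes; a minimal sketch:

---
import core.atomic;

shared int counter;

void bump()
{
    // read-modify-write performed as a single atomic step; a plain
    // counter++ on shared data carries no such guarantee
    atomicOp!"+="(counter, 1);
}

int current()
{
    // sequentially consistent load of the shared value
    return atomicLoad(counter);
}
---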
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:24 PM, Jonathan M Davis wrote:
 On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted. Andrei
Nov 14 2012
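A sketch of what TDPL 13.14 describes; note that, as the replies below point out, the lifting is not actually implemented, so current compilers may not accept the plain field access:

---
// Every public method of a synchronized class runs under the object's
// monitor; per TDPL, top-level shared is lifted from the fields inside
// method bodies, so no casts or atomics are needed for them.
synchronized class Counter
{
    private int count;

    void increment()
    {
        ++count; // relies on the TDPL-described lifting
    }

    int get()
    {
        return count;
    }
}
---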
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 20:32:35 Andrei Alexandrescu wrote:
 TDPL 13.14 explains that inside synchronized classes, top-level shared
 is automatically lifted.
Then it's doing the casting for you. I suppose that that's an argument that using synchronized classes when dealing with shared is the way to go (which IIRC TDPL does argue), but that only applies to classes, and there are plenty of cases (maybe even the majority) where it's built-in types like arrays or AAs which people are trying to share, and synchronized classes won't help them there unless they create wrapper types. And explicit casting will be required for them.

And of course, anyone wanting to use mutexes or synchronized blocks will have to use explicit casts regardless of what they're protecting, because it won't be inside a synchronized class. So, while synchronized classes make dealing with classes nicer, they only handle a very specific portion of what might be used with shared.

In any case, I clearly need to reread TDPL's threading stuff (and maybe the whole book). It's been a while since I read it, and I'm getting rusty on the details.

By the way, speaking of synchronized classes, as I understand it, they're still broken with regards to TDPL in that synchronized is still used on functions rather than classes like TDPL describes. So, they aren't currently a solution regardless of what the language's actual design is supposed to be. Obviously, that should be fixed though.

- Jonathan M Davis
Nov 15 2012
prev sibling parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 15.11.2012 05:32, Andrei Alexandrescu wrote:
 On 11/14/12 7:24 PM, Jonathan M Davis wrote:
 On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted. Andrei
There are three problems I currently see with this:

- It's not actually implemented
- It's not safe, because unshared references can be escaped or dragged in
- Synchronized classes provide no way to avoid the automatic locking in certain methods, but often it is necessary to have more fine-grained control for efficiency reasons, or to avoid deadlocks
Nov 15 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 04:30, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/11/12 6:30 PM, Walter Bright wrote:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.
The pattern Walter describes is primitive and useful, and I'd like to see shared assist to that end (see my previous post). You can endeavour to do any other fancy stuff you like, but until some distant future when it's actually done, then proven and well supported, I'll keep doing this.

Not to repeat my prev post... but in reply to Walter's take on it, it would be interesting if 'shared' just added implicit lock()/unlock() methods to do the mutex acquisition and then removed the cast requirement, but had the language runtime assert that the object is locked whenever it is accessed (this guarantees the safety in a more useful way; the casts are really annoying). I can't imagine a simpler and more immediately useful solution.

In fact, it's a reasonably small step to this being possible with user-defined attributes. Although attributes have no current mechanism to add a mutex, and lock/unlock methods to the object being attributed (like
Nov 15 2012
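A rough library approximation of the proposal, with every name hypothetical: the datum carries its own mutex, lock()/unlock() are explicit, and each access asserts that the lock is held instead of requiring a cast:

---
import core.sync.mutex : Mutex;

struct Guarded(T)
{
    private T value;
    private Mutex mtx;
    private bool held;

    @disable this(this); // copying would duplicate the 'held' flag

    void lock()   { mtx.lock(); held = true; }
    void unlock() { held = false; mtx.unlock(); }

    // all access funnels through here, enforcing "must be locked"
    ref T get()
    {
        assert(held, "guarded data accessed without holding its lock");
        return value;
    }
}

Guarded!T guarded(T)()
{
    Guarded!T g;
    g.mtx = new Mutex;
    return g;
}
---

Usage would look like: auto x = guarded!int(); x.lock(); x.get() += 1; x.unlock(); so a runtime assert replaces the explicit cast step.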
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-15 10:22, Manu wrote:

 Not to repeat my prev post... but in reply to Walter's take on it, it
 would be interesting if 'shared' just added implicit lock()/unlock()
 methods to do the mutex acquisition and then remove the cast
 requirement, but have the language runtime assert that the object is
 locked whenever it is accessed (this guarantees the safety in a more
 useful way, the casts are really annoying). I can't imagine a simpler and
 more immediately useful solution.
How about implementing a library function, something like this:

    shared int i;

    lock(i, (x) {
        // operate on x
    });

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB

-- 
/Jacob Carlborg
Nov 15 2012
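A minimal sketch of the helper being described; the real implementation is in the pastebin, so this version, including its single global mutex, is only an assumption:

---
import core.sync.mutex : Mutex;

private __gshared Mutex lockMutex;

shared static this() { lockMutex = new Mutex; }

void lock(T)(ref shared(T) var, scope void delegate(ref T) dg)
{
    lockMutex.lock();
    scope (exit) lockMutex.unlock();
    dg(*cast(T*) &var); // shared is stripped only while the lock is held
}

void example()
{
    static shared int i;
    lock(i, (ref int x) { x += 1; }); // operate on x as a plain int
}
---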
parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 12:14, Jacob Carlborg <doob me.com> wrote:

 On 2012-11-15 10:22, Manu wrote:

  Not to repeat my prev post... but in reply to Walter's take on it, it
 would be interesting if 'shared' just added implicit lock()/unlock()
 methods to do the mutex acquisition and then remove the cast
 requirement, but have the language runtime assert that the object is
 locked whenever it is accessed (this guarantees the safety in a more
 useful way, the casts are really annoying). I can't imagine a simpler and
 more immediately useful solution.
How about implementing a library function, something like this:

    shared int i;

    lock(i, (x) {
        // operate on x
    });

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB
Interesting concept. Nice idea, could certainly be useful, but it doesn't address the problem as directly as my suggestion. There are still many problem situations, for instance, any time a template is involved. The template doesn't know to do that internally, but under my proposal, you lock it prior to the workload, and then the template works as expected. Templates won't just break and fail whenever shared is involved, because assignments would be legal. They'll just assert that the thing is locked at the time, which is the programmer's responsibility to ensure.
Nov 15 2012
next sibling parent luka8088 <luka8088 owave.net> writes:
On 15.11.2012 11:52, Manu wrote:
 On 15 November 2012 12:14, Jacob Carlborg <doob me.com> wrote:

     On 2012-11-15 10:22, Manu wrote:

         Not to repeat my prev post... but in reply to Walter's take on it, it
         would be interesting if 'shared' just added implicit lock()/unlock()
         methods to do the mutex acquisition and then remove the cast
         requirement, but have the language runtime assert that the object is
         locked whenever it is accessed (this guarantees the safety in a more
         useful way, the casts are really annoying). I can't imagine a simpler
         and more immediately useful solution.


     How about implementing a library function, something like this:

     shared int i;

     lock(i, (x) {
          // operate on x
     });

     * "lock" will acquire a lock
     * Cast away shared for "i"
     * Call the delegate with the now plain "int"
     * Release the lock

     http://pastebin.com/tfQ12nJB


 Interesting concept. Nice idea, could certainly be useful, but it
 doesn't address the problem as directly as my suggestion.
 There are still many problem situations, for instance, any time a
 template is involved. The template doesn't know to do that internally,
 but under my proposal, you lock it prior to the workload, and then the
 template works as expected. Templates won't just break and fail whenever
 shared is involved, because assignments would be legal. They'll just
  assert that the thing is locked at the time, which is the programmer's
 responsibility to ensure.
I managed to make a simple example that works with the current implementation:

http://dpaste.dzfl.pl/27b6df62
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=4#post-k7s0gs:241h45:241:40digitalmars.com

It seems to me that solving this shared issue cannot be done purely at the compiler level but will require runtime support. Actually, I don't see how it can be done properly without being able to say "this lock must be locked when accessing this variable".

http://dpaste.dzfl.pl/edbd3e10
Nov 15 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-15 11:52, Manu wrote:

 Interesting concept. Nice idea, could certainly be useful, but it
 doesn't address the problem as directly as my suggestion.
 There are still many problem situations, for instance, any time a
 template is involved. The template doesn't know to do that internally,
 but under my proposal, you lock it prior to the workload, and then the
 template works as expected. Templates won't just break and fail whenever
 shared is involved, because assignments would be legal. They'll just
  assert that the thing is locked at the time, which is the programmer's
 responsibility to ensure.
I don't understand how a template would cause problems. -- /Jacob Carlborg
Nov 15 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 11:22:30 Manu wrote:
 Not to repeat my prev post... but in reply to Walter's take on it, it would
 be interesting if 'shared' just added implicit lock()/unlock() methods to
 do the mutex acquisition and then remove the cast requirement, but have the
 language runtime assert that the object is locked whenever it is accessed
 (this guarantees the safety in a more useful way, the casts are really
 annoying). I can't imagine a simpler and more immediately useful solution.
 
 In fact, it's a reasonably small step to this being possible with
 user-defined attributes. Although attributes have no current mechanism to
 add a mutex, and lock/unlock methods to the object being attributed (like

1. It wouldn't stop you from needing to cast away shared at all, because without casting away shared, you wouldn't be able to pass it to anything, because the types would differ. Even if you were arguing that doing something like

void foo(C c) {...}
shared c = new C;
foo(c); //no cast required, lock automatically taken

it wouldn't work, because then foo could squirrel away a reference to c somewhere, and the type system would have no way of knowing that it was a shared variable that was being squirreled away as opposed to a thread-local one, which means that it'll likely generate incorrect code. That can happen with the cast as well, but at least in that case, you're forced to be explicit about it, and it's automatically @system. If it's done for you, it'll be easy to miss and screw up.

2. It's often the case that you need to lock/unlock groups of stuff together, such that locking specific variables is often of limited use and would just introduce pointless extra locks when dealing with multiple variables. It would also increase the risk of deadlocks, because you wouldn't have much - if any - control over what order locks were acquired in when dealing with multiple shared variables.

- Jonathan M Davis
Nov 15 2012
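Jonathan's second point, in code (names illustrative): with one lock per variable and no control over acquisition order, two threads touching the same pair of variables in opposite orders can deadlock.

---
import core.sync.mutex : Mutex;

__gshared Mutex lockA, lockB; // imagine these implicitly attached to two shared variables

void threadOne()
{
    lockA.lock();               // holds A...
    scope (exit) lockA.unlock();
    lockB.lock();               // ...and waits for B
    scope (exit) lockB.unlock();
    // work on both variables
}

void threadTwo()
{
    lockB.lock();               // holds B...
    scope (exit) lockB.unlock();
    lockA.lock();               // ...and waits for A: classic deadlock
    scope (exit) lockA.unlock();
    // work on both variables
}
---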
prev sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, November 15, 2012 11:22:30 Manu wrote:
 Not to repeat my prev post... but in reply to Walter's take on it, it would
 be interesting if 'shared' just added implicit lock()/unlock() methods to
 do the mutex acquisition and then remove the cast requirement, but have the
 language runtime assert that the object is locked whenever it is accessed
 (this guarantees the safety in a more useful way, the casts are really
 annoying). I can't imagine a simpler and more immediately useful solution.

 In fact, it's a reasonably small step to this being possible with
 user-defined attributes. Although attributes have no current mechanism to
 add a mutex, and lock/unlock methods to the object being attributed (like

 1. It wouldn't stop you from needing to cast away shared at all, because
 without casting away shared, you wouldn't be able to pass it to anything,
 because the types would differ. Even if you were arguing that doing something
 like

 void foo(C c) {...}
 shared c = new C;
 foo(c); //no cast required, lock automatically taken

 it wouldn't work, because then foo could squirrel away a reference to c
 somewhere, and the type system would have no way of knowing that it was a
 shared variable that was being squirreled away as opposed to a thread-local
 one, which means that it'll likely generate incorrect code. That can happen
 with the cast as well, but at least in that case, you're forced to be explicit
 about it, and it's automatically @system. If it's done for you, it'll be easy
 to miss and screw up.
I don't really see the difference, other than, as you say, the cast is explicit. Obviously the possibility for the situation you describe exists; it's equally possible with the cast, except this way the usage pattern is made more convenient, the user has a convenient way to control the locks and, most importantly, it would work with templates.

That said, this sounds like another perfect application of 'scope'. Perhaps only scope parameters can receive a locked, shared thing... that would mechanically protect you against escape.

 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.

Perhaps a usage mechanism could be more like:

    shared int x, y, z;
    synchronised with(x, y, z)
    {
        // do work with x, y, z, all locked together.
    }
Nov 15 2012
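A library sketch of what such a grouped synchronised-with could lower to; the helper's name and signature are invented, and it avoids the ordering problem by always acquiring the group's mutexes in one global order (by address):

---
import core.sync.mutex : Mutex;
import std.algorithm : sort;

void synchronisedWith(scope void delegate() dg, Mutex[] mutexes...)
{
    auto order = mutexes.dup;
    // a single global acquisition order prevents deadlock
    sort!((a, b) => cast(void*) a < cast(void*) b)(order);
    foreach (m; order)
        m.lock();
    scope (exit)
        foreach_reverse (m; order)
            m.unlock();
    dg(); // run the body with the whole group locked
}
---

A call would look like synchronisedWith({ /* work with x, y, z */ }, mx, my, mz), with mx/my/mz being the mutexes guarding the three variables.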
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
Nov 15 2012
parent reply "Jason House" <jason.james.house gmail.com> writes:
On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright 
 <newshound2 digitalmars.com> wrote:
 
 To make a shared type work in an algorithm, you have to:
 
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nov 17 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
Nov 18 2012
next sibling parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 19.11.2012 05:57, deadalnix wrote:
 On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
But you can at least prove ownership under some limited circumstances. Limited, but (without having tested it on a large scale) still practical. Interest seems even more limited than those circumstances, but anyway: http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com (the same approach that I already posted in this thread, but in a state that should be more or less bulletproof)
Nov 19 2012
prev sibling parent "Jason House" <jason.james.house gmail.com> writes:
On Monday, 19 November 2012 at 04:57:16 UTC, deadalnix wrote:
 On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly 
 wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
Bartosz's design was very explicit about ownership, but was deemed too complex for D2. Shared was kept simple, but underpowered. Here's what I remember of Bartosz's design:

- Shared object members are owned by the enclosing container unless explicitly marked otherwise
- Lock-free shared data is marked differently
- Non-lock-free shared objects required locking them prior to access, but did not require separate shared and non-shared code
- No sequential consistency

I really liked his design, but I think the explicit ownership part was considered too complex. There may still be something that can be done to improve D2, but I doubt it'd be a complete solution.
Nov 20 2012
prev sibling next sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
 
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library-based solution, and a nice way to use shared is to only use it for disabling access to non-shared members while its monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:

---
class MyClass {
    void method();
}

void main()
{
    auto inst = new shared(MyClass);
    //inst.method(); // forbidden

    {
        ScopedLock!MyClass l = lock(inst);
        l.method(); // now allowed as long as 'l' is in scope
    }

    // can also be called like this:
    inst.lock().method();
}
---

ScopedLock is non-copyable and handles the dirty details of locking and casting away 'shared' when it's safe to do so. No tagging of the class with 'synchronized' or 'shared' needs to be done and everything works nicely without casts.

This comes with a restriction, though. Doing all this is only safe as long as the instance is known to not contain any unisolated aliasing*. So use would be restricted to types that contain only immutable or unique/isolated references. So I also implemented an Isolated!(T) type that is recognized by ScopedLock, as well as functions such as spawn(). The resulting usage can be seen in the example at the bottom. It doesn't provide all the flexibility that a built-in 'isolated' type would, but the possible use cases at least look interesting.

There are still some details to be worked out, such as writing a spawn() function that correctly moves Isolated!() parameters instead of copying them, or the forward reference error mentioned in the example. I'll now try and see if some of my earlier multi-threading designs fit into this system.

---
import std.stdio;
import std.typecons;
import std.traits;
import stdx.typecons;

class SomeClass {
}

class Test {
    private {
        string m_test1 = "test 1";
        Isolated!SomeClass m_isolatedReference;
        // currently causes a size forward reference error:
        //Isolated!Test m_next;
    }

    this()
    {
        //m_next = ...;
    }

    void test1() const { writefln(m_test1); }
    void test2() const { writefln("test 2"); }
}

void main()
{
    writefln("Shared locking");
    // create a shared instance of Test - no members will be accessible
    auto t = new shared(Test);
    {
        // temporarily lock t to make all non-shared members safely
        // available; lock() works only for objects with no unisolated
        // aliasing.
        ScopedLock!Test l = lock(t);
        l.test1();
        l.test2();
    }

    // passing a shared object to a different thread works as usual
    writefln("Shared spawn");
    spawn(&myThreadFunc1, t);

    // create an isolated instance of Test
    // currently, Test may not contain unisolated aliasing, but this
    // requirement may get lifted, as long as only pure methods are called
    Isolated!Test u = makeIsolated!Test();

    // move ownership to a different function and recover
    writefln("Moving unique");
    Isolated!Test v = myThreadFunc2(u.move());

    // moving to a different thread also works
    writefln("Moving unique spawn");
    spawn(&myThreadFunc2, v.move());

    // another possibility is to convert to immutable
    auto w = makeIsolated!Test();
    writefln("Convert to immutable spawn");
    spawn(&myThreadFunc3, w.freeze());

    // or just lose the isolation and act on the base type
    writefln("Convert to mutable");
    auto x = makeIsolated!Test();
    Test xm = x.extract();
    xm.test1();
    xm.test2();
}

void myThreadFunc1(shared(Test) t)
{
    // call non-shared methods on a shared object
    t.lock().test1();
    t.lock().test2();
}

Isolated!Test myThreadFunc2(Isolated!Test t)
{
    // call methods as usual on an isolated object
    t.test1();
    t.test2();
    return t.move();
}

void myThreadFunc3(immutable(Test) t)
{
    t.test1();
    t.test2();
}

// fake spawn function just to test the type constraints
void spawn(R, ARGS...)(R function(ARGS) func, ARGS args)
{
    foreach( i, T; ARGS )
        static assert(!hasUnisolatedAliasing!T || !hasUnsharedAliasing!T,
            "Parameter "~to!string(i)~" of type "~T.stringof~" has unshared"
            ~" or unisolated aliasing. Cannot safely be passed to a"
            ~" different thread.");

    // TODO: do this in a different thread...
    // TODO: don't cheat with the 1-parameter move detection
    static if( __traits(compiles, func(args[0])) ) func(args);
    else func(args[0].move());
}
---

* shared aliasing would also be OK, but this is not yet handled by the implementation.
Nov 12 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 11:41:00 -0000, Sönke Ludwig <sludwig outerproduct.org> wrote:

 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
 After reading Walter's comment, it suddenly seemed obvious that we are
 currently using 'shared' the wrong way. Shared is just not meant to be
 used on objects at all (or only in some special cases like
 synchronization primitives). I just experimented a bit with a statically
 checked library-based solution and a nice way to use shared is to only
 use it for disabling access to non-shared members while its monitor is
 not locked. A ScopedLock proxy and a lock() function can be used for
 this:
I had exactly the same idea:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnsrds954xghj:40puck.auriga.bhead.co.uk

But, then I went right back the other way:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnt4iyz54xghj:40puck.auriga.bhead.co.uk

I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 12 2012
parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 13:33, Regan Heath wrote:
 On Mon, 12 Nov 2012 11:41:00 -0000, Sönke Ludwig
 <sludwig outerproduct.org> wrote:
 
 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library based solution and a nice way to use shared is to only use it for disabling access to non-shared members while its monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:
I had exactly the same idea:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnsrds954xghj:40puck.auriga.bhead.co.uk

But, then I went right back the other way:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnt4iyz54xghj:40puck.auriga.bhead.co.uk

I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

R
The only problem is that for this approach to be safe, any aliasing outside of the object's reference tree that is not 'shared' must be disallowed. To get the maximum use out of this, some kind of 'isolated'/'unique' qualifier is needed again. So a built-in language solution - which would definitely be highly desirable - that allows this would also either have to introduce a new type qualifier, or recognize the corresponding library structure which implements this. Since for various reasons both possibilities have a questionable probability of being implemented, I decided to go and see what can be done with the current state. By now I would be more than happy to have _any_ decent solution that works and that can also be recommended to others.
Nov 12 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 12:41, Sönke Ludwig wrote:
 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). <snip>
With some kind of ownership in the type system, it can be made automatic that shared is cast away on synchronized objects.
Nov 12 2012
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 14:00, deadalnix wrote:
 
 With some kind of ownership in the type system, it can be made automatic
 that shared is cast away on synchronized objects.
Yes, and I would love to have that, but I fear that we then basically end up where Bartosz Milewski was at the end of his research. And unfortunately that went too far to be considered for (mid-term) inclusion. Besides its shortcomings, there are also actually some advantages to a library-based solution. For example, it could be allowed to customize the lock()/unlock() function so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network-based distributed object systems.
Nov 12 2012
parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 14:23, Sönke Ludwig wrote:
 On 12.11.2012 14:00, deadalnix wrote:
 With some kind of ownership in the type system, it can be made automatic
 that shared is cast away on synchronized objects.
Yes, and I would love to have that, but I fear that we then basically end up where Bartosz Milewski was at the end of his research. And unfortunately that went too far to be considered for (mid-term) inclusion. Besides its shortcomings, there are also actually some advantages to a library-based solution. For example, it could be allowed to customize the lock()/unlock() function so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network-based distributed object systems.
Don't get me started on fibers /D
Nov 12 2012
prev sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
I generated some quick documentation with examples here:

http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html
http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html
http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html

It does offer some nice improvements: not a single cast, and everything is
statically checked.
Nov 12 2012
parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 16:27, Sönke Ludwig wrote:
 I generated some quick documentation with examples here:
 
 http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html
 http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html
 http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html
 
 It does offer some nice improvements: not a single cast, and everything is
 statically checked.
 
All examples compile now. Put everything on github for reference: https://github.com/s-ludwig/d-isolated-test
Nov 12 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 04:12:47 Andrej Mitrovic wrote:
 On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 From what I recall of what TDPL says
It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.
Good to know, but none of that really has anything to do with the casting, which is what I was responding to. And looking at that list, it sounds reasonable that all of that would be guaranteed to be atomic, but I think that the fundamental problem that's affecting usability is all of the casting that's typically required. And I don't see any way around that other than writing code that doesn't need to pass shared objects around or using templates very heavily. - Jonathan M Davis
Nov 14 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 14:32:47 Manu wrote:
 On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 I don't really see the difference, other than, as you say, the cast is
 explicit.
 Obviously the possibility for the situation you describe exists, it's
 equally possible with the cast, except this way, the usage pattern is made
 more convenient, the user has a convenient way to control the locks and
 most importantly, it would work with templates.
 That said, this sounds like another perfect application of 'scope'. Perhaps
 only scope parameters can receive a locked, shared thing... that would
 mechanically protect you against escape.
You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat. _Maybe_ you could get away with it in very restricted circumstances where both pure and scope are being used, but then it becomes so restrictive that it's nearly useless anyway. And again, it would be hidden from the programmer, when this is something that _needs_ to be explicit. Having implicit locks happen on you could really screw with any code trying to do explicit locks, as would be needed anyway in all but the most basic cases.
 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.
Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks (both in terms of memory and in terms of the runtime costs). Not to mention, it would probably cause all kinds of issues for something like shared int* to have a mutex with it, because then its size is completely different from int*. It also would cause even worse problems when that shared int* was cast to int* (aside from the size issues), because all of the locking that was happening for the shared int* was invisible. If you want automatic locks, then use synchronized classes. That's what they're for.

Honestly, I really don't buy into the idea that it makes sense for shared to magically make multi-threaded code work without the programmer worrying about locks. Making it so that it's well-defined as to what's atomic is great for code that has any chance of being lock-free, but it's still up to the programmer to understand when locks are and aren't needed and how to use them correctly. I don't think that it can possibly work for it to be automatic. It's far too easy to introduce deadlocks, and it would only work in the simplest of cases anyway, meaning that the programmer needs to understand and properly solve the issues anyway. And if the programmer has to understand it all to get it right, why bother adding the extra overhead and deadlock potential caused by automatically locking anything? D provides some great synchronization primitives. People should use them.

I think that the only things that shared really needs to be solving are:

1. Indicating to the compiler via the type system that the object is not thread-local. This properly segregates shared and unshared code and allows the compiler to take advantage of thread locality for optimizations and avoid optimizations with shared code that screw up threading (e.g. double-checked locking won't work if the compiler does certain optimizations).

2. Making it explicit and well-defined as part of the language which operations can be assumed to be atomic (even if that set of operations is very small, having it be well-defined is valuable).

3. Ensuring sequential consistency so that it's possible to do lock-free code when atomic operations permit it and so that there are fewer weird issues due to undefined behavior.

- Jonathan M Davis
Nov 15 2012
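To make point 1 concrete: the classic double-checked locking pattern is only correct when reads and writes of the shared reference are ordered, which is exactly what plain thread-local-style accesses don't promise. A sketch using core.atomic (surrounding names are assumed):

---
import core.atomic;

class Config { int x; this() { x = 42; } }

shared Config instance;
__gshared Object initLock; // assume initialized at program start

Config getConfig()
{
    auto cfg = atomicLoad(instance);        // ordered read
    if (cfg is null)
    {
        synchronized (initLock)
        {
            cfg = atomicLoad(instance);     // re-check under the lock
            if (cfg is null)
            {
                cfg = cast(shared) new Config;
                atomicStore(instance, cfg); // publish only after construction
            }
        }
    }
    return cast(Config) cfg;
}
---

Without the ordered load/store (or the guarantees shared is supposed to provide), the write of the reference may become visible before the writes done in the constructor, so another thread can observe a half-built object.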
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 15:00, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, November 15, 2012 14:32:47 Manu wrote:
 On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 I don't really see the difference, other than, as you say, the cast is
 explicit.
 Obviously the possibility for the situation you describe exists, it's
 equally possible with the cast, except this way, the usage pattern is made
 more convenient, the user has a convenient way to control the locks and
 most importantly, it would work with templates.
 That said, this sounds like another perfect application of 'scope'. Perhaps
 only scope parameters can receive a locked, shared thing... that would
 mechanically protect you against escape.
You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat.
... no, they're not even the same thing. const things cannot be changed. Shared things are still mutable things, and perfectly compatible with other non-shared mutable things; they just have some access control requirements.

 _Maybe_ you could get away with it in very restricted circumstances where
 both pure and scope are being used, but then it becomes so restrictive that
 it's nearly useless anyway. And again, it would be hidden from the
 programmer, when this is something that _needs_ to be explicit. Having
 implicit locks happen on you could really screw with any code trying to do
 explicit locks, as would be needed anyway in all but the most basic cases.

I think you must have misunderstood my suggestion; I certainly didn't suggest locking would be implicit. All locks would be explicit; all I suggested is that shared things would gain an associated mutex, and an implicit assert that said mutex is locked whenever the thing is accessed, rather than denying assignment between shared/unshared things. You could use lock methods, or a nice alternative would be to submit them to some sort of synchronised scope like luka illustrates. I'm of the opinion that for the time being, explicit lock control is mandatory (anything else is a distant dream), and atomic primitives may not be relied upon.
 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.
Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks (both in terms of memory and in terms of the runtime costs). Not to mention, it would probably cause all kinds of issues for something like shared int* to have a mutex with it, because then its size is completely different from int*. It also would cause even worse problems when that shared int* was cast to int* (aside from the size issues), because all of the locking that was happening for the shared int* was invisible. If you want automatic locks, then use synchronized classes. That's what they're for. Honestly, I really don't buy into the idea that it makes sense for shared to magically make multi-threaded code work without the programmer worrying about locks. Making it so that it's well-defined as to what's atomic is great for code that has any chance of being lock-free, but it's still up to the programmer to understand when locks are and aren't needed and how to use them correctly. I don't think that it can possibly work for it to be automatic. It's far too easy to introduce deadlocks, and it would only work in the simplest of cases anyway, meaning that the programmer needs to understand and properly solve the issues anyway. And if the programmer has to understand it all to get it right, why bother adding the extra overhead and deadlock potential caused by automatically locking anything? D provides some great synchronization primitives. People should use them.
To all above: You've completely misunderstood my suggestion. It's basically the same as luka's. It's not that hard: shared just assists the user with what they do anyway by associating a lock primitive, and implicitly asserting that it is locked when accessed. No magic should be performed on the user's behalf.

 I think that the only things that shared really needs to be solving are:
 1. Indicating to the compiler via the type system that the object is not
 thread-local. This properly segregates shared and unshared code and allows the
 compiler to take advantage of thread locality for optimizations and avoid
 optimizations with shared code that screw up threading (e.g. double-checked
 locking won't work if the compiler does certain optimizations).

 2. Making it explicit and well-defined as part of the language which operations
 can be assumed to be atomic (even if that set of operations is very small,
 having it be well-defined is valuable).

 3. Ensuring sequential consistency so that it's possible to do lock-free code
 when atomic operations permit it and so that there are fewer weird issues due
 to undefined behavior.

 - Jonathan M Davis
Nov 15 2012
prev sibling parent "Mehrdad" <wfunction hotmail.com> writes:
Would it be useful if 'shared' in D did something like 'volatile' 
in C++ (as in, Andrei's article on volatile-correctness)?
http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766
Nov 15 2012