
digitalmars.D - Something needs to happen with shared, and soon.

reply Alex Rønne Petersen <alex lycus.org> writes:
Hi,

It's starting to get outright embarrassing to talk to newcomers about 
D's concurrency support because the most fundamental part of it -- the 
shared type qualifier -- does not have well-defined semantics at all.

I'm certainly not alone in being annoyed by this state of affairs: 
http://d.puremagic.com/issues/show_bug.cgi?id=8993

I've posted rants about the state of shared before and, from the 
comments on those, it appears that what most people want shared to do is 
at least one (and usually multiple) of

* make variables global (if appropriate in the context);
* make the wrapped type completely separate from the unwrapped type;
* make all operations be atomic;
* make all operations result in memory barriers.

At a glance, this looks fine. Exactly what you would want for shared 
types in a concurrent setting, right?

Except, not really. I'll try to explain all of the unsolved problems 
with shared below...

First of all, the fact that shared(T) is completely separate from T 
(i.e. no conversions allowed, except for primitive types) is a huge 
usability problem. In practice, it means that 99% of the standard 
library is unusable with shared types. Hell, even most of the runtime 
doesn't work with shared types. I don't know how to best solve this 
particular problem; I'm just pointing it out because anyone who tries to 
do anything non-trivial with shared will invariably run into this.
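
To illustrate with a minimal sketch (Point and scale are invented names,
not from any real library):

struct Point { int x, y; }

void scale(ref Point p, int factor)
{
    p.x *= factor;
    p.y *= factor;
}

void main()
{
    shared Point p = Point(1, 2);
    // scale(p, 3);             // error: cannot pass shared(Point) to ref Point
    scale(*cast(Point*) &p, 3); // the ubiquitous workaround: cast shared away
}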

Second, the idea of making shared insert atomic operations is an 
absolute fallacy. It only makes sense for primitive types for the most 
part, and even for those, what sizes are supported depends on the target 
architecture. A number of ideas have come up to solve this problem:

* We make shared(T) not compile for certain Ts depending on the target 
architecture. I personally think this is a terrible idea because most 
code using shared will not be portable at all.
* We require any architecture D targets to support atomic operations for 
a certain size S at the very least. This is fine for primitives up to 64 
bits in size, but doesn't clear up the situation for larger types (real, 
complex types, cent/ucent, ...).
* We make shared not insert atomic operations at all (thus making it 
kind of useless for anything but documentation).
* (Possibly others I have forgotten; please let me know if this is the 
case.)

I don't think any of these are particularly attractive, to be honest. If 
we do make shared insert atomic operations, we would also have to 
consider the memory ordering of those operations.
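
For concreteness, the explicit route druntime offers today is
core.atomic; a hedged sketch of what any of these schemes would
presumably lower to:

import core.atomic;

shared int counter;

void bump()
{
    atomicOp!"+="(counter, 1);          // atomic read-modify-write
    int snapshot = atomicLoad(counter); // atomic read
    atomicStore(counter, snapshot);     // atomic write (the pair is not
                                        // atomic as a whole, unlike atomicOp)
}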

Third, we have memory barriers. I strongly suspect that this is a 
misnomer in most cases where people have suggested this; it's generally 
not useful to have a compiler insert barriers because they are used to 
control ordering of load/store operations which is something the 
programmer will want to do explicitly. In any case, the compiler can't 
usefully figure out where to put barriers, so it would just result in 
really bad performance for no apparent gain.

Fourth, there is implementation complexity. If shared is meant to insert 
specialized instructions, it will result in effectively two code paths 
for most code generation in any D compiler (read: maintenance nightmare).

Fifth, it is completely unclear whether casting to and from shared is 
legal (but with a big fat "caution" sign like casting away const) or if 
it's undefined behavior. Making it undefined behavior would further 
increase the usability problem I described above.

And finally, the worst part of all of this? People writing code that 
uses shared today are blindly assuming it actually does the right thing. 
It doesn't. Their code will break on any non-x86 platform. This is an 
absolutely horrifying situation now that ARM, MIPS, and PowerPC are 
starting to become viable targets for D.

Something needs to be done about shared. I don't know what, but the 
current situation is -- and I'm really not exaggerating here -- 
laughable. I think we either need to just make it perfectly clear that 
shared is for documentation purposes and nothing else, or, figure out an 
alternative system to shared, because I don't see shared actually being 
useful for real world work no matter what we do with it.

-- 
Alex Rønne Petersen
alex lycus.org
http://lycus.org
Nov 11 2012
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:
 And finally, the worst part of all of this? People writing code that
 uses shared today are blindly assuming it actually does the right thing.
 It doesn't.
I think most people probably don't even use shared due to lacking
Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036

Not even using the write functions worked on shared types until 2.059
(e.g. printing shared arrays).

'shared' has this wonderfully attractive name to it, but apparently it
doesn't have many guarantees? E.g. Walter's comment here:
http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1

So +1 from me just because I have no idea what shared is supposed to
guarantee. I've just stubbornly used __gshared variables because
std.concurrency.send() doesn't accept mutable data. send() doesn't work
with shared either, so I have no clue.. :)
Nov 11 2012
parent "Chris Nicholson-Sauls" <ibisbasenji gmail.com> writes:
On Sunday, 11 November 2012 at 19:28:30 UTC, Andrej Mitrovic 
wrote:
 On 11/11/12, Alex Rønne Petersen <alex lycus.org> wrote:
 And finally, the worst part of all of this? People writing 
 code that
 uses shared today are blindly assuming it actually does the 
 right thing.
 It doesn't.
I think most people probably don't even use shared due to lacking
Phobos support. E.g. http://d.puremagic.com/issues/show_bug.cgi?id=7036

Not even using the write functions worked on shared types until 2.059
(e.g. printing shared arrays).

'shared' has this wonderfully attractive name to it, but apparently it
doesn't have many guarantees? E.g. Walter's comment here:
http://d.puremagic.com/issues/show_bug.cgi?id=8077#c1

So +1 from me just because I have no idea what shared is supposed to
guarantee. I've just stubbornly used __gshared variables because
std.concurrency.send() doesn't accept mutable data. send() doesn't work
with shared either, so I have no clue.. :)
Fix support for shared(T) in std.variant, and you will have fixed
send() as well. Meanwhile, in common cases a simple wrapper struct
suffices.

module toy;

import std.concurrency, std.stdio;

struct SImpl
{
    string s;
    int    i;
}

alias shared( SImpl ) S;

struct Msg  { S s; }
struct Quit {}

S global = S( "global", 999 );

void main ()
{
    auto child = spawn( &task );

    S s = S( "abc", 42 );
    child.send( Msg( s ) );
    child.send( Msg( global ) );
    child.send( Quit() );
}

void task ()
{
    bool sentinel = true;
    while ( sentinel )
    {
        receive(
            ( Msg  msg ) { writeln( msg.s.s, " -- ", msg.s.i ); },
            ( Quit msg ) { sentinel = false;                    }
        );
    }
}

grant aesgard ~/Projects/D/foo/shared_test $ dmd toy && ./toy
abc -- 42
global -- 999

-- Chris Nicholson-Sauls
Nov 11 2012
prev sibling next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Fully agree.

Kind Regards
Benjamin Thaut
Nov 11 2012
parent reply "martin" <kinke libero.it> writes:
On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
Nov 11 2012
parent reply Graham St Jack <Graham.StJack internode.on.net> writes:
On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:

 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
+1. I find it so broken that I have to avoid using it in all but the most trivial situations.
Nov 11 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 11/11/2012 23:36, Graham St Jack wrote:
 On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:

 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
+1
+1. I find it so broken that I have to avoid using it in all but the most trivial situations.
That isn't a bad thing in itself.
Nov 11 2012
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, November 12, 2012 01:17:06 deadalnix wrote:
 On 11/11/2012 23:36, Graham St Jack wrote:
 On Sun, 11 Nov 2012 22:19:08 +0100, martin wrote:
 On Sunday, 11 November 2012 at 20:08:25 UTC, Benjamin Thaut wrote:
 Fully agree.
 +1
 +1.

 I find it so broken that I have to avoid using it in all but the most
 trivial situations.
 That isn't a bad thing in itself.

I don't think that it's really intended that shared be 100% easy to use.
You're _supposed_ to use it sparingly. But at this point, it borders on
being utterly unusable.

We have a bit of a problem with the basic idea though in that you're not
supposed to be using shared much, and it's supposed to be segregated such
that having the shared equivalent of const (as in it works with both
shared and non-shared) would pose a big problem (it's also probably
untenable with memory barriers and the like), but if you _don't_ have
something like that, you either can't use shared with much of anything,
or you have to cast it away all over the place, which loses all of the
memory barriers or whatnot. We have conflicting requirements which aren't
being managed very well.

I don't know how protected shared really needs to be though. Anything
involving shared should make heavy use of mutexes and synchronized and
whatnot, meaning that at least some of the protections that people want
with shared are useless unless you're writing code which is being stupid
and not using mutexes or whatnot. So, casting away shared might not
actually be that big a deal so long as it's temporary to call a function
(as opposed to stashing the variable away somewhere) and that call is
protected by a mutex or other thread-protection mechanism.

At the moment, I think that the only way to make stuff work with both
shared and unshared (aside from using lots of casts) is to make use of
templates, and since most of druntime and Phobos isn't tested with
shared, things like Unqual probably screw with that pretty thoroughly.
It's at least conceivable though that stuff like std.algorithm could work
with shared just fine.

I don't think that there's much question though that shared is the major
chink in our armor with regards to thread-local by default. The basic
idea is great, but the details still need some work.

- Jonathan M Davis
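
To make that template point concrete, a minimal sketch (sum is a
made-up example; nothing here is atomic, the loop just copies
primitives out of the shared array):

int sum(E)(E[] arr)
{
    int total = 0;
    foreach (e; arr)   // each element is copied into a non-shared local
        total += e;    // plain, non-atomic reads
    return total;
}

void main()
{
    int[] a = [1, 2, 3];
    shared(int)[] b = [4, 5, 6];
    assert(sum(a) == 6);
    assert(sum(b) == 15); // instantiated as sum!(shared(int))
}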
Nov 11 2012
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Alex Rønne Petersen:

 Something needs to be done about shared. I don't know what,
Maybe deprecate it and introduce something else that is rather
different and based on thought-out theory?

Bye,
bearophile
Nov 11 2012
prev sibling next sibling parent "nixda" <nd o.de> writes:
drop it in favour of :
http://forum.dlang.org/post/k7j1ta$2kv8$1 digitalmars.com


On Sunday, 11 November 2012 at 18:46:12 UTC, Alex Rønne Petersen 
wrote:
 Hi,

 It's starting to get outright embarrassing to talk to newcomers 
 about D's concurrency support because the most fundamental part 
 of it -- the shared type qualifier -- does not have 
 well-defined semantics at all.

 I'm certainly not alone in being annoyed by this state of 
 affairs: http://d.puremagic.com/issues/show_bug.cgi?id=8993

 I've posted rants about the state of shared before and, from 
 the comments on those, it appears that what most people want 
 shared to do is at least one (and usually multiple) of

 * make variables global (if appropriate in the context);
 * make the wrapped type completely separate from the unwrapped 
 type;
 * make all operations be atomic;
 * make all operations result in memory barriers.

 At a glance, this looks fine. Exactly what you would want for 
 shared types in a concurrent setting, right?

 Except, not really. I'll try to explain all of the unsolved 
 problems with shared below...

 First of all, the fact that shared(T) is completely separate 
 from T (i.e. no conversions allowed, except for primitive 
 types) is a huge usability problem. In practice, it means that 
 99% of the standard library is unusable with shared types. 
 Hell, even most of the runtime doesn't work with shared types. 
 I don't know how to best solve this particular problem; I'm 
 just pointing it out because anyone who tries to do anything 
 non-trivial with shared will invariably run into this.

 Second, the idea of making shared insert atomic operations is 
 an absolute fallacy. It only makes sense for primitive types 
 for the most part, and even for those, what sizes are supported 
 depends on the target architecture. A number of ideas have come 
 up to solve this problem:

 * We make shared(T) not compile for certain Ts depending on the 
 target architecture. I personally think this is a terrible idea 
 because most code using shared will not be portable at all.
 * We require any architecture D targets to support atomic 
 operations for a certain size S at the very least. This is fine 
 for primitives up to 64 bits in size, but doesn't clear up the 
 situation for larger types (real, complex types, cent/ucent, 
 ...).
 * We make shared not insert atomic operations at all (thus 
 making it kind of useless for anything but documentation).
 * (Possibly others I have forgotten; please let me know if this 
 is the case.)

 I don't think any of these are particularly attractive, to be 
 honest. If we do make shared insert atomic operations, we would 
 also have to consider the memory ordering of those operations.

 Third, we have memory barriers. I strongly suspect that this is 
 a misnomer in most cases where people have suggested this; it's 
 generally not useful to have a compiler insert barriers because 
 they are used to control ordering of load/store operations 
 which is something the programmer will want to do explicitly. 
 In any case, the compiler can't usefully figure out where to 
 put barriers, so it would just result in really bad performance 
 for no apparent gain.

 Fourth, there is implementation complexity. If shared is meant 
 to insert specialized instructions, it will result in 
 effectively two code paths for most code generation in any D 
 compiler (read: maintenance nightmare).

 Fifth, it is completely unclear whether casting to and from 
 shared is legal (but with a big fat "caution" sign like casting 
 away const) or if it's undefined behavior. Making it undefined 
 behavior would further increase the usability problem I 
 described above.

 And finally, the worst part of all of this? People writing code 
 that uses shared today are blindly assuming it actually does 
 the right thing. It doesn't. Their code will break on any 
 non-x86 platform. This is an absolutely horrifying situation 
 now that ARM, MIPS, and PowerPC are starting to become viable 
 targets for D.

 Something needs to be done about shared. I don't know what, but 
 the current situation is -- and I'm really not exaggerating 
 here -- laughable. I think we either need to just make it 
 perfectly clear that shared is for documentation purposes and 
 nothing else, or, figure out an alternative system to shared, 
 because I don't see shared actually being useful for real world 
 work no matter what we do with it.
Nov 11 2012
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:

 Something needs to be done about shared. I don't know what, but the 
 current situation is -- and I'm really not exaggerating here -- 
 laughable. I think we either need to just make it perfectly clear that 
 shared is for documentation purposes and nothing else, or, figure out 
 an alternative system to shared, because I don't see shared actually 
 being useful for real world work no matter what we do with it.
I feel like the concurrency aspect of D2 was rushed in the haste of
having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
as well as destructors running in any thread (thanks GC!) plus a couple
of other irritants makes the whole concurrency scheme completely flawed
if you ask me. D2 needs a near complete overhaul on the concurrency
front.

I'm currently working on a big code base in C++. While I do miss D when
it comes to working with templates as well as for its compilation speed
and a few other things, I can't say I miss D much when it comes to
anything touching concurrency.

[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 11 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/12/2012 02:48 AM, Michel Fortin wrote:
 On 2012-11-11 18:46:10 +0000, Alex Rønne Petersen <alex lycus.org> said:

 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out
 an alternative system to shared, because I don't see shared actually
 being useful for real world work no matter what we do with it.
I feel like the concurrency aspect of D2 was rushed in the haste of having it ready for TDPL. Shared, deadlock-prone synchronized classes[1] as well as destructors running in any thread (thanks GC!) plus a couple of other irritants makes the whole concurrency scheme completely flawed if you ask me. D2 needs a near complete overhaul on the concurrency front. I'm currently working on a big code base in C++. While I do miss D when it comes to working with templates as well as for its compilation speed and a few other things, I can't say I miss D much when it comes to anything touching concurrency. [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
Nov 13 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency front.
 
 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.
 
 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code, so
shared-by-default is not something I have to fight with very often. I
do agree that thread-local is a better default.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 13 2012
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 22:12:12 Michel Fortin wrote:
 I tend to have very little global state in my code, so
 shared-by-default is not something I have to fight with very often. I
 do agree that thread-local is a better default.
Thread-local by default is a _huge_ step forward, and in hindsight, it
seems pretty ridiculous that a language would do anything else. Shared
by default is just too horrible.

- Jonathan M Davis
Nov 13 2012
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.

 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.

 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}

Unfortunately, this destroys 'pure' even though it actually does not.
 so shared-by-default
 is not something I have to fight with very often.  I do agree that
 thread-local is a better default.
Nov 14 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:
 
 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized classes[1]
 as well as destructors running in any thread (thanks GC!) plus a couple
 of other irritants makes the whole concurrency scheme completely flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.
 
 I'm currently working on a big code base in C++. While I do miss D when
 it comes to working with templates as well as for its compilation speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.
 
 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}
I'd consider that poor style. Use a struct to encapsulate the state,
then make bar and baz member functions of that struct. The resulting
code is cleaner and easier to read:

pure int foo()
{
    auto state = State(new_value);
    state.bar();
    return state.baz();
}

You could achieve something similar with nested functions too.
 Unfortunately, this destroys 'pure' even though it actually does not.
Using a local-scoped struct would work with pure, be more efficient
(accessing thread-local variables takes more cycles), and be less
error-prone while refactoring.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11/14/2012 01:42 PM, Michel Fortin wrote:
 On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 04:12 AM, Michel Fortin wrote:
 On 2012-11-13 19:54:32 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/12/2012 02:48 AM, Michel Fortin wrote:
 I feel like the concurrency aspect of D2 was rushed in the haste of
 having it ready for TDPL. Shared, deadlock-prone synchronized
 classes[1]
 as well as destructors running in any thread (thanks GC!) plus a
 couple
 of other irritants makes the whole concurrency scheme completely
 flawed
 if you ask me. D2 needs a near complete overhaul on the concurrency
 front.

 I'm currently working on a big code base in C++. While I do miss D
 when
 it comes to working with templates as well as for its compilation
 speed
 and a few other things, I can't say I miss D much when it comes to
 anything touching concurrency.

 [1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/
I am always irritated by shared-by-default static variables.
I tend to have very little global state in my code,
So do I. A thread-local static variable does not imply global state.
(The execution stack is static.) Eg. in a few cases it is sensible to
use static variables as implicit arguments to avoid having to pass them
around by copying them all over the execution stack.

private int x = 0;

int foo(){
    int xold = x;
    scope(exit) x = xold;
    x = new_value;
    bar(); // reads x
    return baz(); // reads x
}
I'd consider that poor style.
I'd consider that poor style.
I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification. "In a few cases" it is not, even if it is poor style "most of the time".
 Use a struct to encapsulate the state, then make bar, and baz member functions
of that struct.
They could eg. be virtual member functions of a class already.
 Using a local-scoped struct would work with pure,
It might.
 be more efficient
Not necessarily.
 (accessing thread-local variables takes more cycles),
It can be accessed sparsely, copying around the struct pointer is work too, and the fastest access path in a proper alternative design would potentially be even slower.
 and be less error-prone while refactoring.
If done in such a way that it makes refactoring error prone, it is to be considered poor style.
Nov 14 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-14 14:30:19 +0000, Timon Gehr <timon.gehr gmx.ch> said:

 On 11/14/2012 01:42 PM, Michel Fortin wrote:
 On 2012-11-14 10:30:46 +0000, Timon Gehr <timon.gehr gmx.ch> said:
 
 So do I. A thread-local static variable does not imply global state.
 (The execution stack is static.) Eg. in a few cases it is sensible to
 use static variables as implicit arguments to avoid having to pass
 them around by copying them all over the execution stack.
 
 private int x = 0;
 
 int foo(){
      int xold = x;
      scope(exit) x = xold;
      x = new_value;
      bar(); // reads x
      return baz(); // reads x
 }
I'd consider that poor style.
I'd consider this a poor statement to make. Universally quantified assertions require more rigorous justification.
Indeed. There's not enough context to judge fairly. I can accept the
idea there are situations where it is really inconvenient or impossible
to pass the state as an argument.

That said, I disagree that this is not using global state. It might not
be globally accessible (because x is private), but the state still
exists globally since variable x exists in all threads irrespective of
whether they use foo or not.
 If done in such a way that it makes refactoring error prone, it is to 
 be considered poor style.
I guess we agree.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that
type work in a concurrent context, regardless of what is done with
memory barriers. Memory barriers ensure sequential consistency; they do
nothing for race conditions that are sequentially consistent. Remember,
single core CPUs are all sequentially consistent, and still have major
concurrency problems. This also means that having templates accept
shared(T) as arguments and have them magically generate correct
concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to
ever work. Adding barriers has to be done by someone who knows what
they're doing for that particular use case, and the compiler inserting
them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced
self-documentation is immensely useful. It flags where and when data is
being shared. So, your algorithm won't compile when you pass it a shared
type? That is because it is NEVER GOING TO WORK with a shared type. At
least you get a compile time indication of this, rather than random
runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.
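
A minimal sketch of that sequence (Data, dataLock and update are
invented names; the mutex lives in __gshared space):

import core.sync.mutex;

struct Data { int count; }

__gshared Mutex dataLock;
shared Data data;

shared static this() { dataLock = new Mutex; }

void update()
{
    dataLock.lock();                // 1. ensure single threaded access
    scope (exit) dataLock.unlock(); // 5. release the mutex on scope exit
    auto p = cast(Data*) &data;     // 2. cast away shared
    p.count += 1;                   // 3. operate on the data
}                                   // 4. data is only reachable as shared again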
Nov 11 2012
next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
The only problem beeing that you can not really have user defined shared 
(value) types:

http://d.puremagic.com/issues/show_bug.cgi?id=8295

Kind Regards
Benjamin Thaut
Nov 11 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/11/2012 10:05 PM, Benjamin Thaut wrote:
 The only problem beeing that you can not really have user defined shared
(value)
 types:

 http://d.puremagic.com/issues/show_bug.cgi?id=8295
If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
Nov 11 2012
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 12.11.2012 07:50, schrieb Walter Bright:
 On 11/11/2012 10:05 PM, Benjamin Thaut wrote:
 The only problem beeing that you can not really have user defined
 shared (value)
 types:

 http://d.puremagic.com/issues/show_bug.cgi?id=8295
If you include an object designed to work only in a single thread (non-shared), make it shared, and then destruct it when other threads may be pointing to it ... What should happen?
I'm not talking about objects, I'm talking about value types. And you
can't make it work at all. If you do

shared ~this()
{
    buf = null;
}

it won't work either. You don't have _any_ option to destroy a shared
struct.

Kind Regards
Benjamin Thaut
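
For reference, the pattern from the bug report looks roughly like this
(a sketch; Buffer is a made-up name):

struct Buffer
{
    int[] buf;
    ~this() { buf = null; }            // non-shared destructor
    // shared ~this() { buf = null; }  // rejected as well, per issue 8295
}

void main()
{
    shared Buffer b;
    // on scope exit the compiler must run the destructor on a
    // shared(Buffer), which fails to type check
}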
Nov 12 2012
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
On Sun, 11 Nov 2012 18:30:17 -0800,
Walter Bright <newshound2 digitalmars.com> wrote:

 
 To make a shared type work in an algorithm, you have to:
 
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
 
 Also, all op= need to be disabled for shared types.
But there are also shared member functions and they're kind of annoying
right now:

* You can't call shared methods from non-shared methods or vice versa.
  This leads to code duplication, you basically have to implement
  everything twice:

----------
struct ABC
{
    Mutex mutex;
    void a()
    {
        aImpl();
    }
    shared void a()
    {
        synchronized(mutex)
            aImpl();  //not allowed
    }
    private void aImpl()
    {
    }
}
----------
The only way to avoid this is casting away shared in the shared a
method, but that really is annoying.

* You can't have data members be included only for the shared version.
  In the above example, the mutex member will always be included, even
  if the ABC instance is thread local.

So you're often better off writing a non-thread safe struct and writing
a wrapper struct. This way you don't have useless overhead in the
non-thread safe implementation. But the nice instance syntax is lost:

shared(ABC) abc1; ABC abc2;
vs
SharedABC abc1; ABC abc2;

even worse, shared propagation won't work this way:

struct DEF
{
    ABC abc;
}
shared(DEF) def;
def.abc.a();

and then there's also the druntime issue: core.sync doesn't work with
shared, which leads to this schizophrenic situation:

struct A
{
    Mutex m;
    void a() //Doesn't compile with shared
    {
        m.lock();  //Compiles, but locks on a TLS mutex!
        m.unlock();
    }
}

struct A
{
    shared Mutex m;
    shared void a()
    {
        m.lock();  //Doesn't compile
        (cast(Mutex)m).unlock(); //Ugly
    }
}

So the only useful solution avoids using shared:

struct A
{
    __gshared Mutex m; //Good we have __gshared!
    shared void a()
    {
        m.lock();
        m.unlock();
    }
}

And then there are some open questions with advanced use cases:

* How do I make sure that a non-shared delegate is only accepted if I
  have an A, but a shared delegate should be supported for shared(A)
  and A? (calling a shared delegate from a non-shared function should
  work, right?)

struct A
{
    void a(T)(T v)
    {
        writeln("non-shared");
    }
    shared void a(T)(T v) if (isShared!v) //isShared doesn't exist
    {
        writeln("shared");
    }
}

And having fun with this little example: http://dpaste.dzfl.pl/7f6a4ad2

* What's the difference between "void delegate() shared" and
  "shared(void delegate())"?

Error: cannot implicitly convert expression (&a.abc) of type void
delegate() shared to shared(void delegate())

* So let's call it void delegate() shared instead:

void incrementA(void delegate() shared del)
/home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
are only valid for non-static member functions
Nov 12 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
    This leads to code duplication, you basically have to implement
    everything twice:
You can't get away from the fact that data that can be accessed from
multiple threads has to be dealt with in a *fundamentally* different way
than single threaded code. You cannot share code between the two. There
is simply no conceivable way that "shared" can be added and then code
will become thread safe.

Most of the issues you're having seem to revolve around treating shared
data access just like single threaded access, except "shared" was added.
This cannot work. The compiler error messages, while very annoying, are
in their own obscure way pointing this out.

It's my fault, I have not explained shared very well, and have oversold
it. It does not solve concurrency problems, it points them out.
 ----------
 struct ABC
 {
          Mutex mutex;
 	void a()
 	{
 		aImpl();
 	}
 	shared void a()
 	{
 		synchronized(mutex)
 		    aImpl();  //not allowed
 	}
 	private void aImpl()
 	{
 		
 	}
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.
As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.
 * You can't have data members be included only for the shared version.
    In the above example, the mutex member will always be included, even
    if ABC instance is thread local.

 So you're often better off writing a non-thread safe struct and writing
 a wrapper struct. This way you don't have useless overhead in the
 non-thread safe implementation. But the nice instance syntax is
 lost:

 shared(ABC) abc1; ABC abc2;
 vs
 SharedABC abc1; ABC abc2;

 even worse, shared propagation won't work this way;

 struct DEF
 {
      ABC abc;
 }
 shared(DEF) def;
 def.abc.a();



 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:
 struct A
 {
      Mutex m;
      void a() //Doesn't compile with shared
      {
          m.lock();  //Compiles, but locks on a TLS mutex!
          m.unlock();
      }
 }

 struct A
 {
      shared Mutex m;
      shared void a()
      {
          m.lock();  //Doesn't compile
          (cast(Mutex)m).unlock(); //Ugly
      }
 }

 So the only useful solution avoids using shared:
 struct A
 {
      __gshared Mutex m; //Good we have __gshared!
      shared void a()
      {
          m.lock();
          m.unlock();
      }
 }
Yes, mutexes will need to exist in a global space.
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only accepted if I
    have an A, but a shared delegate should be supported
    for shared(A) and A? (calling a shared delegate from a non-shared
    function should work, right?)

 struct A
 {
      void a(T)(T v)
      {
          writeln("non-shared");
      }
      shared void a(T)(T v)  if (isShared!v) //isShared doesn't exist
      {
          writeln("shared");
      }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
    and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type void
 delegate() shared
The delegate deals with shared data.
 to shared(void delegate())
The variable holding the delegate is shared.
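
In code form (a hedged sketch of the two types):

struct S
{
    shared void f() { }
}

void main()
{
    shared S s;
    auto dg = &s.f;             // void delegate() shared:
                                // the delegate's context is shared data
    shared(void delegate()) sv; // the *variable* holding an ordinary
                                // delegate is itself shared
}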
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
    are only valid for non-static member functions
Nov 12 2012
next sibling parent reply luka8088 <luka8088 owave.net> writes:
If I understood correctly there is no reason why this should not compile ?

import core.sync.mutex;

class MyClass {
   void method () {}
}

void main () {
   auto myObject = new shared(MyClass);
   synchronized (myObject) {
     myObject.method();
   }
}


On 12.11.2012 12:19, Walter Bright wrote:
 On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
 This leads to code duplication, you basically have to implement
 everything twice:
You can't get away from the fact that data that can be accessed from
multiple threads has to be dealt with in a *fundamentally* different way
than single threaded code. You cannot share code between the two. There
is simply no conceivable way that "shared" can be added and then code
will become thread safe.

Most of the issues you're having seem to revolve around treating shared
data access just like single threaded access, except "shared" was added.
This cannot work. The compiler error messages, while very annoying, are
in their own obscure way pointing this out.

It's my fault, I have not explained shared very well, and have oversold
it. It does not solve concurrency problems, it points them out.
 ----------
 struct ABC
 {
 Mutex mutex;
 void a()
 {
 aImpl();
 }
 shared void a()
 {
 synchronized(mutex)
 aImpl(); //not allowed
 }
 private void aImpl()
 {

 }
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.
As I explained, the way to manipulate shared data is to get exclusive access to it via a mutex, cast away the shared-ness, manipulate it as single threaded data, convert it back to shared, and release the mutex.
 * You can't have data members be included only for the shared version.
 In the above example, the mutex member will always be included, even
 if ABC instance is thread local.

 So you're often better off writing a non-thread safe struct and writing
 a wrapper struct. This way you don't have useless overhead in the
 non-thread safe implementation. But the nice instance syntax is
 lost:

 shared(ABC) abc1; ABC abc2;
 vs
 SharedABC abc1; ABC abc2;

 even worse, shared propagation won't work this way;

 struct DEF
 {
 ABC abc;
 }
 shared(DEF) def;
 def.abc.a();



 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:
 struct A
 {
 Mutex m;
 void a() //Doesn't compile with shared
 {
 m.lock(); //Compiles, but locks on a TLS mutex!
 m.unlock();
 }
 }

 struct A
 {
 shared Mutex m;
 shared void a()
 {
 m.lock(); //Doesn't compile
 (cast(Mutex)m).unlock(); //Ugly
 }
 }

 So the only useful solution avoids using shared:
 struct A
 {
 __gshared Mutex m; //Good we have __gshared!
 shared void a()
 {
 m.lock();
 m.unlock();
 }
 }
Yes, mutexes will need to exist in a global space.
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only accepted if I
 have an A, but a shared delegate should be supported
 for shared(A) and A? (calling a shared delegate from a non-shared
 function should work, right?)

 struct A
 {
 void a(T)(T v)
 {
 writeln("non-shared");
 }
 shared void a(T)(T v) if (isShared!v) //isShared doesn't exist
 {
 writeln("shared");
 }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
 and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type void
 delegate() shared
The delegate deals with shared data.
 to shared(void delegate())
The variable holding the delegate is shared.
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout attributes
 are only valid for non-static member functions
Nov 12 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 16:00, luka8088 wrote:
 If I understood correctly there is no reason why this should not compile ?

 import core.sync.mutex;

 class MyClass {
 void method () {}
 }

 void main () {
 auto myObject = new shared(MyClass);
 synchronized (myObject) {
 myObject.method();
 }
 }
D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
parent luka8088 <luka8088 owave.net> writes:
Here i as wild idea:

//////////

void main () {

   mutex x;
   // mutex is not a type but rather a keyword
   // x is a symbol in order to allow
   // different x in different scopes

   shared(x) int i;
   // ... or maybe use UDA ?
   // mutex x must be locked
   // in order to change i

   synchronized (x) {
     // lock x in a compiler-aware way
     i++;
     // compiler guarantees that i will not
     // be changed outside synchronized(x)
   }

}

//////////

so I tried something similar with current implementation:

//////////

import std.stdio;

void main () {

   shared(int) i1;
   auto m1 = new MyMutex();

   i1.attachMutex(m1);
   // m1 must be locked in order to modify i1
	
   // i1++;
   // should throw a compiler error

   // sharedAccess(i1)++;
   // runtime exception, m1 is not locked

   synchronized (m1) {
     sharedAccess(i1)++;
     // ok, m1 is locked
   }

}

// some generic code

import core.sync.mutex;

class MyMutex : Mutex {
   @property bool locked = false;
   @trusted void lock () {
     super.lock();
     locked = true;
   }
   @trusted void unlock () {
     locked = false;
     super.unlock();
   }
   bool tryLock () {
     bool result = super.tryLock();
     if (result)
       locked = true;
     return result;
   }
}

template unshared (T : shared(T)) {
   alias T unshared;
}

template unshared (T : shared(T)*) {
   alias T* unshared;
}

auto ref sharedAccess (T) (ref T value) {
   assert(value.attachMutex().locked);
   unshared!(T)* refVal = (cast(unshared!(T*)) &value);
   return *refVal;
}

MyMutex attachMutex (T) (T value, MyMutex mutex = null) {
   static __gshared MyMutex[T] mutexes;
   // this memory leak can be solved
   // but it's left like this to make the code simple
   synchronized if (value !in mutexes && mutex !is null)
     mutexes[value] = mutex;
   assert(mutexes[value] !is null);
   return mutexes[value];
}

//////////

and another example with methods:

//////////

import std.stdio;

class a {
   int i;
   void increment () { i++; }
}

void main () {

   auto a1 = new shared(a);
   auto m1 = new MyMutex();

   a1.attachMutex(m1);
   // m1 must be locked in order to modify a1
	
   // a1.increment();
   // compiler error

   // sharedAccess(a1).increment();
   // runtime exception, m1 is not locked

   synchronized (m1) {
     sharedAccess(a1).increment();
     // ok, m1 is locked
   }

}

// some generic code

import core.sync.mutex;

class MyMutex : Mutex {
   @property bool locked = false;
   @trusted void lock () {
     super.lock();
     locked = true;
   }
   @trusted void unlock () {
     locked = false;
     super.unlock();
   }
   bool tryLock () {
     bool result = super.tryLock();
     if (result)
       locked = true;
     return result;
   }
}

template unshared (T : shared(T)) {
   alias T unshared;
}

template unshared (T : shared(T)*) {
   alias T* unshared;
}

auto ref sharedAccess (T) (ref T value) {
   assert(value.attachMutex().locked);
   unshared!(T)* refVal = (cast(unshared!(T*)) &value);
   return *refVal;
}

MyMutex attachMutex (T) (T value, MyMutex mutex = null) {
   static __gshared MyMutex[T] mutexes;
   // this memory leak can be solved
   // but it's left like this to make the code simple
   synchronized if (value !in mutexes && mutex !is null)
     mutexes[value] = mutex;
   assert(mutexes[value] !is null);
   return mutexes[value];
}

//////////

In any case, if shared itself does not provide locking and does not fix 
problems but only points them out (not to be misunderstood, I 
completely agree with that) then I think that assigning a mutex to the 
variable is a must.

Although the latter examples already work with the current 
implementation, I like the first one (or something similar to the first 
one) more; it looks cleaner and leaves space for additional 
optimizations.


On 12.11.2012 17:14, deadalnix wrote:
 Le 12/11/2012 16:00, luka8088 a écrit :
 If I understood correctly there is no reason why this should not
 compile ?

 import core.sync.mutex;

 class MyClass {
 void method () {}
 }

 void main () {
 auto myObject = new shared(MyClass);
 synchronized (myObject) {
 myObject.method();
 }
 }
D has no ownership, so the compiler can't know whether it is safe to do so or not.
Nov 12 2012
prev sibling parent "Johannes Pfau" <nospam example.com> writes:
On Monday, 12 November 2012 at 11:19:57 UTC, Walter Bright wrote:
 On 11/12/2012 2:57 AM, Johannes Pfau wrote:
 But there are also shared member functions and they're kind of 
 annoying
 right now:

 * You can't call shared methods from non-shared methods or 
 vice versa.
   This leads to code duplication, you basically have to 
 implement
   everything twice:
You can't get away from the fact that data that can be accessed from multiple threads has to be dealt with in a *fundamentally* different way than single threaded code. You cannot share code between the two. There is simply no conceivable way that "shared" can be added and then code will become thread safe.
I know shared can't automatically make the code thread safe. I just wanted to point out that this casting / code duplication is annoying, but I don't know how it could be solved either.
 Yes, mutexes will need to exist in a global space.
I'm not sure if I understand this. Don't you think shared(Mutex) should work? AFAICS that's only a library problem: add shared to the lock / unlock methods in druntime and it should work? Or global as in not in the struct instance?
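
Until then, a hedged sketch of the workaround: tiny free functions that
do the cast in one audited place (lockShared/unlockShared are made-up
names, not druntime API):

import core.sync.mutex;

void lockShared(shared Mutex m)   { (cast(Mutex) m).lock(); }
void unlockShared(shared Mutex m) { (cast(Mutex) m).unlock(); }

// usage inside a shared method, given a field `shared Mutex m`:
// shared void a()
// {
//     m.lockShared();
//     scope (exit) m.unlockShared();
//     // ... manipulate the data ...
// }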
 And then there are some open questions with advanced use cases:
 * How do I make sure that a non-shared delegate is only 
 accepted if I
   have an A, but a shared delegate should be supported
   for shared(A) and A? (calling a shared delegate from a 
 non-shared
   function should work, right?)

 struct A
 {
     void a(T)(T v)
     {
         writeln("non-shared");
     }
     shared void a(T)(T v)  if (isShared!v) //isShared doesn't 
 exist
     {
         writeln("shared");
     }
 }
First, you have to decide what you mean by a shared delegate. Do you mean the variable containing the two pointers that make up a delegate are shared, or the delegate is supposed to deal with shared data?
I'm talking about a delegate pointing to a method declared with the
"shared" keyword and the "this pointer" pointing to a shared object:

struct A
{
    shared void a(){}
}
shared A instance;
auto del = &instance.a; //I'm talking about this type

To explain that usecase: I think of a shared delegate as a delegate
that can be safely called from different threads. So I can store it in
a struct instance and later on call it from any thread:

struct Signal
{
    //The variable is shared _AND_ the method is shared
    shared(shared void delegate()) _handler;

    shared void call() //Can be called from any thread
    {
        //Would have to synchronize access to the variable in a real
        //world case, but the call itself wouldn't have to be
        //synchronized
        shared void delegate() localHandler;
        synchronized(mutex)
        {
            localHandler = _handler;
        }
        localHandler();
    }
}
 And having fun with this little example:
 http://dpaste.dzfl.pl/7f6a4ad2

 * What's the difference between: "void delegate() shared"
   and "shared(void delegate())"?

 Error: cannot implicitly convert expression (&a.abc) of type 
 void
 delegate() shared
The delegate deals with shared data.
OK, so that's what I need, but the compiler doesn't let me declare that
type:

alias void delegate() shared del;
Error: const/immutable/shared/inout attributes are only valid for
non-static member functions
 to shared(void delegate())
The variable holding the delegate is shared.
OK, but when it's used as a function parameter, which is pass-by-value
for delegates, and because of tail-shared there's effectively no
difference, right? In that case it's not possible to pass a shared
variable to the function, as this will always create a copy?

void abcd(shared(void delegate()) del)

which is the same as

void abcd(shared void delegate() del)

How would you pass del as a shared variable?
 * So let's call it void delegate() shared instead:
 void incrementA(void delegate() shared del)
 /home/c684/c922.d(7): Error: const/immutable/shared/inout 
 attributes
   are only valid for non-static member functions
Nov 12 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 12, 2012, at 2:57 AM, Johannes Pfau <nospam example.com> wrote:

 On Sun, 11 Nov 2012 18:30:17 -0800,
 Walter Bright <newshound2 digitalmars.com> wrote:

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.

 But there are also shared member functions and they're kind of annoying
 right now:

 * You can't call shared methods from non-shared methods or vice versa.
   This leads to code duplication, you basically have to implement
   everything twice:

 ----------
 struct ABC
 {
     Mutex mutex;
     void a()
     {
         aImpl();
     }
     shared void a()
     {
         synchronized(mutex)
             aImpl();  //not allowed
     }
     private void aImpl()
     {
     }
 }
 ----------
 The only way to avoid this is casting away shared in the shared a
 method, but that really is annoying.

Yes. You end up having two methods for each function, one as a
synchronized wrapper that casts away shared and another that does the
actual work.

 and then there's also the druntime issue: core.sync doesn't work with
 shared which leads to this schizophrenic situation:

 struct A
 {
     Mutex m;
     void a() //Doesn't compile with shared
     {
         m.lock();  //Compiles, but locks on a TLS mutex!
         m.unlock();
     }
 }

Most of the reason for this was that I didn't like the old implications
of shared, which was that shared methods would at some time in the
future end up with memory barriers all over the place. That's been
dropped, but I'm still not a fan of the wrapper method for each
function. It makes for a crappy class design.
Nov 14 2012
prev sibling next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright  
<newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a
"scoped" struct wrapping the mutex and shared object which does all the
"dirty" work for you. I'm thinking..

                             // (0)
with(ScopedLock(obj,lock))   // (1)
{
  obj.foo = 2;               // (2)
}                            // (3)
                             // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support but
it would be "nice" if the compiler did something clever with 'obj' such
that it knew it wasn't 'shared' inside the 'with' above. If not, if a
full library solution is desired, we could always have another
temporary "unshared" variable referencing obj.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
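
A rough library-only cut of such a ScopedLock (a sketch; the cast is
the unchecked part that compiler support would make safe):

import core.sync.mutex;

struct ScopedLock(T)
{
    private T* obj;      // unshared view of the wrapped object
    private Mutex lock;

    @disable this(this); // guard against double unlock

    this(ref shared T sharedObj, Mutex m)
    {
        lock = m;
        lock.lock();               // (1) acquire the mutex...
        obj = cast(T*) &sharedObj; //     ...and cast shared away
    }

    ~this() { lock.unlock(); }     // (3) release on scope exit

    ref T get() { return *obj; }   // (2) unshared access inside the scope
}

// usage, assuming some shared(Foo) obj and a global Mutex lock:
// auto sl = ScopedLock!Foo(obj, lock);
// sl.get.foo = 2;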
Nov 12 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 11:55:51 -0000, Regan Heath <regan netmail.co.nz>  
wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright  
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice" is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking.. // (0) with(ScopedLock(obj,lock)) // (1) { obj.foo = 2; // (2) } // (3) // (4) (0) obj is a "shared" reference, lock is a global mutex (1) mutex is acquired here, shared is cast away (2) 'obj' is not "shared" here so data access is allowed (3) ScopedLock is "destroyed" and the mutex released (4) obj is shared again I think most of the above can be done without any compiler support but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the the 'with' above. If not, if a full library solution is desired we could always have another temporary "unshared" variable referencing obj.
There was talk a while back about how to handle the existing object
mutex and synchronized{} statement blocks, and this subject has me
thinking back to that. My thinking has gone full circle and rather than
bore you with all the details I want to present a conclusion which I am
hoping is both implementable and useful.

First off, IIRC object contains a mutex/monitor/critical section, which
means all objects contain one. The last discussion saw many people
wanting this removed for efficiency. I propose we do this. Then, if a
class or struct is declared as "shared", or a "shared" instance of a
class or struct is constructed, we magically include one (compiler magic
which I hope is possible).

Secondly, I say we make "shared" illegal on basic types. This is a
limitation(*) but I believe in most cases a single int is unlikely to be
shared without an accompanying group of other variables, and usually an
algorithm operating on those variables. These variables and the
algorithm should be encapsulated in a class or struct - which can in
turn be shared.

Now.. the synchronized() {} statement can do the magic described above
(as ScopedLock) for us. It would be illegal to call it on a non "shared"
instance. It would acquire the mutex and cast away "shared" inside the
block/scope; at the end of the scope it would cast shared back and
release the mutex.

(*) for those rare cases where a single int or other basic type is all
that is shared, we can provide a wrapper struct which is declared as
"shared" - see the sketch below.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
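
A hedged sketch of that wrapper for a lone basic type, built on
core.atomic (SharedInt is a made-up name):

import core.atomic;

struct SharedInt
{
    private shared int value;

    int  get()      { return atomicLoad(value); }
    void set(int v) { atomicStore(value, v); }
    void opOpAssign(string op)(int v) { atomicOp!(op ~ "=")(value, v); }
}

// usage:
// __gshared SharedInt hits;
// hits += 1;            // atomic, via opOpAssign
// auto n = hits.get();  // atomic read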
Nov 12 2012
parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 13:25, Regan Heath wrote:
 First off, IIRC object contains a mutex/monitor/critical section, which
 means all objects contain one. The last discussion saw many people
 wanting this removed for efficiency. I propose we do this. Then, if a
 class or struct is declared as "shared" or a "shared" instance of a
 class or struct is constructed we magically include one (compiler magic
 which I hope is possible).
As already explained in the thread you mention, it is not going to work. The conclusion of that thread was that only synchronized classes should have a mutex field.
 Secondly I say we make "shared" illegal on basic types. This is a
 limitation(*) but I believe in most cases a single int is unlikely to be
 shared without an accompanying group of other variables, and usually an
 algorithm operating on those variables. These variables and the
 algorithm should be encapsulated in a class or struct - which can in
 turn be shared.
Shared reference counting? Disruptor?
 Now.. the synchronized() {} statement can do the magic described above
 (as ScopedLock) for us. It would be illegal to call it on a non "shared"
 instance. It would acquire the mutex and cast away "shared" inside the
 block/scope, at the end of the scope it would cast shared back and
 release the mutex.

 (*) for those rare cases where a single int or other basic type is all
 that is shared we can provide a wrapper struct which is declared as
 "shared".
Nov 12 2012
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-12 12:55, Regan Heath wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking..

                            // (0)
with(ScopedLock(obj,lock))  // (1)
{
    obj.foo = 2;            // (2)
}                           // (3)
                            // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support, but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired, we could always have another temporary "unshared" variable referencing obj.
I'm just throwing it in here again, AST macros could probably solve this. -- /Jacob Carlborg
Nov 12 2012
parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On 2012-11-12, 15:11, Jacob Carlborg wrote:

 On 2012-11-12 12:55, Regan Heath wrote:
 On Mon, 12 Nov 2012 02:30:17 -0000, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what we actually want, in order to make the above "nice", is a "scoped" struct wrapping the mutex and shared object which does all the "dirty" work for you. I'm thinking..

                            // (0)
with(ScopedLock(obj,lock))  // (1)
{
    obj.foo = 2;            // (2)
}                           // (3)
                            // (4)

(0) obj is a "shared" reference, lock is a global mutex
(1) mutex is acquired here, shared is cast away
(2) 'obj' is not "shared" here so data access is allowed
(3) ScopedLock is "destroyed" and the mutex released
(4) obj is shared again

I think most of the above can be done without any compiler support, but it would be "nice" if the compiler did something clever with 'obj' such that it knew it wasn't 'shared' inside the 'with' above. If not, if a full library solution is desired, we could always have another temporary "unshared" variable referencing obj.
I'm just throwing it in here again, AST macros could probably solve this.
Until someone writes a proper DIP on them, macros can write entire software packages, download Hitler, turn D into lisp, and bake bread. Can we please stop with the 'macros could do that' until there's any sort of consensus as to what macros *could* do? -- Simen
Nov 12 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-12 17:57, Simen Kjaeraas wrote:

 Until someone writes a proper DIP on them, macros can write entire software
 packages, download Hitler, turn D into lisp, and bake bread. Can we please
 stop with the 'macros could do that' until there's any sort of consensus as
 to what macros *could* do?
Sure, I can try and stop doing that :) -- /Jacob Carlborg
Nov 12 2012
parent FeepingCreature <default_357-line yahoo.de> writes:
On 11/12/12 20:08, Jacob Carlborg wrote:
 On 2012-11-12 17:57, Simen Kjaeraas wrote:
 
 Until someone writes a proper DIP on them, macros can write entire software
 packages, download Hitler, turn D into lisp, and bake bread. Can we please
 stop with the 'macros could do that' until there's any sort of consensus as
 to what macros *could* do?
Sure, I can try and stop doing that :)
You know, AST macros could probably stop doing that. Food for thought.
Nov 13 2012
prev sibling next sibling parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 03:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.
The compiler is able to do some optimization on that, and, unlike me, it never forgets to put a barrier where one is needed. Some algorithms are safe to use concurrently, granted the right barriers are in place -- think double-checked locking, for instance. This is the very reason volatile was modified in Java 1.5 to include barriers. I wish D's shared would get semantics close to Java's volatile.
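A sketch of the double-checked locking pattern mentioned above, written against core.atomic's explicit primitives; the MemoryOrder names follow current druntime, and Expensive, cached, and initLock are made-up names:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.sync.mutex : Mutex;

struct Expensive { int payload; }

// The pointer slot is global; the pointee is typed shared because it is
// visible to every thread once published.
__gshared shared(Expensive)* cached;
__gshared Mutex initLock;

shared static this() { initLock = new Mutex; }

shared(Expensive)* instance()
{
    // Fast path: the acquire load pairs with the release store below,
    // so a non-null pointer is published together with the object.
    auto p = atomicLoad!(MemoryOrder.acq)(cached);
    if (p is null)
    {
        initLock.lock();
        scope (exit) initLock.unlock();
        p = atomicLoad!(MemoryOrder.acq)(cached);  // re-check under lock
        if (p is null)
        {
            p = new shared(Expensive)(42);
            atomicStore!(MemoryOrder.rel)(cached, p);
        }
    }
    return p;
}

void main() { assert(instance() is instance()); }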
 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.
Agreed.
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.
That is never gonna scale without some kind of ownership of data. Think about slices.
Nov 12 2012
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 12 November 2012 04:30, Walter Bright <newshound2 digitalmars.com> wrote:

 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:

 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the shared
 type
 qualifier -- does not have well-defined semantics at all.

 I think a couple things are clear:

 1. Slapping shared on a type is never going to make algorithms on that
 type work in a concurrent context, regardless of what is done with memory
 barriers. Memory barriers ensure sequential consistency, they do nothing
 for race conditions that are sequentially consistent. Remember, single
 core CPUs are all sequentially consistent, and still have major
 concurrency problems. This also means that having templates accept
 shared(T) as arguments and have them magically generate correct
 concurrent code is a pipe dream.

 2. The idea of shared adding memory barriers for access is not going to
 ever work. Adding barriers has to be done by someone who knows what
 they're doing for that particular use case, and the compiler inserting
 them is not going to substitute.

 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.
I agree completely with the OP; shared is really very unhelpful right now. It just inconveniences you, and forces you to perform explicit casts (which may cast away other attributes like const).

I've thought before that what it might be useful+practical for shared to do is offer convenient methods to implement precisely what you describe above.

Imagine a system where tagging a variable 'shared' would cause it to gain some properties: a mutex, and implicit var.lock()/release() methods to call on either side of access to your shared variable. Unlike the current situation where assignment is illegal, assignment would work as usual, but the shared tag would imply a runtime check to verify the item is locked when performing assignment (perhaps that runtime check would be removed in -release for performance).

This would make implementing the logic you describe above convenient, and you wouldn't need to be declaring explicit mutexes around the place. It would also address safety by asserting that the item is locked whenever it is accessed.
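Roughly what that runtime-checked assignment could look like as a library type; every name here is hypothetical, and note the locked flag itself is not synchronized -- fine for a debugging aid, not a correctness guarantee:

import core.sync.mutex : Mutex;

// A wrapped value gains a mutex plus lock()/release(), and assignment
// asserts that the lock is held.
struct Guarded(T)
{
    private T value;
    private Mutex m;
    private bool locked;        // bookkeeping for the runtime check only

    static Guarded create()
    {
        Guarded g;
        g.m = new Mutex;
        return g;
    }

    void lock()    { m.lock(); locked = true; }
    void release() { locked = false; m.unlock(); }

    void opAssign(T v)
    {
        assert(locked, "assignment to shared data without holding its lock");
        value = v;
    }
}

void main()
{
    auto x = Guarded!int.create();
    x.lock();
    x = 42;        // ok: the runtime check passes
    x.release();
    // x = 43;     // would trip the assert (compiled out with -release)
}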
Nov 12 2012
prev sibling next sibling parent reply luka8088 <luka8088 owave.net> writes:
On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}
Nov 13 2012
next sibling parent reply "luka8088" <luka8088 owave.net> writes:
On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to 
 newcomers about D's
 concurrency support because the most fundamental part of it 
 -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}
Nov 13 2012
next sibling parent reply =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
On 13.11.2012 10:14, luka8088 wrote:
 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}
Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is just a low-level module that represents the OS functionality.
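To make the distinction concrete, a small std.concurrency example; the commented-out line is the kind of thing send() rejects at compile time, because int* carries unshared aliasing:

import std.concurrency : receiveOnly, send, spawn;

void worker()
{
    // Values arrive by copy; no unshared mutable state crosses threads.
    auto n = receiveOnly!int();
    assert(n == 42);
}

void main()
{
    auto tid = spawn(&worker);
    tid.send(42);          // fine: int has no unshared aliasing

    int local;
    // tid.send(&local);   // rejected at compile time: int* aliases
    //                     // thread-local data
}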
Nov 13 2012
parent reply luka8088 <luka8088 owave.net> writes:
On 13.11.2012 10:20, Sönke Ludwig wrote:
 Am 13.11.2012 10:14, schrieb luka8088:
 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    shared int i;
    (new Thread({ i++; })).start();
}

Um, sorry, the following code:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}

Only std.concurrency (using spawn() and send()) enforces that unshared data cannot be passed between threads. The core.thread module is just a low-level module that represents the OS functionality.
In that case http://dlang.org/faq.html#shared_guarantees is wrong; it is not a correct guarantee. Or at least that should be noted there. If nothing else, it is confusing...
Nov 13 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 13 November 2012 at 10:06:12 UTC, luka8088 wrote:
 On 13.11.2012 10:20, Sönke Ludwig wrote:
 Only std.concurrency (using spawn() and send()) enforces that 
 unshared data cannot be pass between
 threads. The core.thread module is just a low-level module 
 that just represents the OS functionality.
In that case http://dlang.org/faq.html#shared_guarantees is wrong, it is not a correct guarantee. Or at least that should be noted there. If nothing else it is confusing...
You are right, it could probably be added to avoid confusion. But then, non-@safe code is not guaranteed to maintain any type system invariants at all if you don't pay attention to what its requirements are, so memory sharing is not really special in that regard… David
Nov 13 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 13, 2012, at 1:14 AM, luka8088 <luka8088 owave.net> wrote:

 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.

 I think a couple things are clear:

 1. Slapping shared on a type is never going to make algorithms on that
 type work in a concurrent context, regardless of what is done with
 memory barriers. Memory barriers ensure sequential consistency, they do
 nothing for race conditions that are sequentially consistent. Remember,
 single core CPUs are all sequentially consistent, and still have major
 concurrency problems. This also means that having templates accept
 shared(T) as arguments and have them magically generate correct
 concurrent code is a pipe dream.

 2. The idea of shared adding memory barriers for access is not going to
 ever work. Adding barriers has to be done by someone who knows what
 they're doing for that particular use case, and the compiler inserting
 them is not going to substitute.

 However, and this is a big however, having shared as compiler-enforced
 self-documentation is immensely useful. It flags where and when data is
 being shared. So, your algorithm won't compile when you pass it a shared
 type? That is because it is NEVER GOING TO WORK with a shared type. At
 least you get a compile time indication of this, rather than random
 runtime corruption.

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex

 Also, all op= need to be disabled for shared types.

 This clarifies a lot, but still a lot of people get confused with
 http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

 And also, given what http://dlang.org/faq.html#shared_guarantees says, I
 come to think that the fact that the following code compiles is either a
 lack of implementation, a compiler bug, or a faq error:

 //////////

 import core.thread;

 void main () {
   int i;
   (new Thread({ i++; })).start();
 }

It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines:

import core.thread;
import std.stdio;

void main()
{
    int i;
    auto t = new Thread({ i++; });
    t.start();
    t.join();
    write(i);
}

This is perfectly safe and has a deterministic result.
Nov 14 2012
parent luka8088 <luka8088 owave.net> writes:
On 14.11.2012 20:54, Sean Kelly wrote:
 On Nov 13, 2012, at 1:14 AM, luka8088<luka8088 owave.net>  wrote:

 On Tuesday, 13 November 2012 at 09:11:15 UTC, luka8088 wrote:
 On 12.11.2012 3:30, Walter Bright wrote:
 On 11/11/2012 10:46 AM, Alex Rønne Petersen wrote:
 It's starting to get outright embarrassing to talk to newcomers about D's
 concurrency support because the most fundamental part of it -- the
 shared type
 qualifier -- does not have well-defined semantics at all.
I think a couple things are clear:

1. Slapping shared on a type is never going to make algorithms on that type work in a concurrent context, regardless of what is done with memory barriers. Memory barriers ensure sequential consistency, they do nothing for race conditions that are sequentially consistent. Remember, single core CPUs are all sequentially consistent, and still have major concurrency problems. This also means that having templates accept shared(T) as arguments and have them magically generate correct concurrent code is a pipe dream.

2. The idea of shared adding memory barriers for access is not going to ever work. Adding barriers has to be done by someone who knows what they're doing for that particular use case, and the compiler inserting them is not going to substitute.

However, and this is a big however, having shared as compiler-enforced self-documentation is immensely useful. It flags where and when data is being shared. So, your algorithm won't compile when you pass it a shared type? That is because it is NEVER GOING TO WORK with a shared type. At least you get a compile time indication of this, rather than random runtime corruption.

To make a shared type work in an algorithm, you have to:

1. ensure single threaded access by acquiring a mutex
2. cast away shared
3. operate on the data
4. cast back to shared
5. release the mutex

Also, all op= need to be disabled for shared types.

This clarifies a lot, but still a lot of people get confused with http://dlang.org/faq.html#shared_memory_barriers -- is it a faq error?

And also, given what http://dlang.org/faq.html#shared_guarantees says, I come to think that the fact that the following code compiles is either a lack of implementation, a compiler bug, or a faq error:

//////////

import core.thread;

void main () {
    int i;
    (new Thread({ i++; })).start();
}

It's intentional. core.thread is for people who know what they're doing, and there are legitimate uses along these lines:

void main()
{
    int i;
    auto t = new Thread({ i++; });
    t.start();
    t.join();
    write(i);
}

This is perfectly safe and has a deterministic result.

Yes, that makes perfect sense... I just wanted to point out the misleading guidance in the FAQ, because (at least before this forum thread) there is not much written about shared and you can get a wrong idea from it (at least I did).
Nov 14 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
 and also with http://dlang.org/faq.html#shared_guarantees said, I come to think
 that the fact that the following code compiles is either lack of
implementation,
 a compiler bug or a faq error ?

 //////////

 import core.thread;

 void main () {
    shared int i;
    (new Thread({ i++; })).start();
 }
I think it's a user bug.
Nov 13 2012
next sibling parent "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 13 November 2012 at 21:29:13 UTC, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused 
 with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
FWIW, I'm with you on this one. Memory barriers would not make shared more useful, as they do not solve the issue with concurrency (as you have explained earlier).
Nov 13 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
Nov 13 2012
next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu 
wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused 
 with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
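One shape such a library utility could take -- a hypothetical helper that packages the five steps into a single call; withLock and Account are made-up names:

import core.sync.mutex : Mutex;

// Hypothetical helper wrapping the recipe: lock, cast away shared,
// operate, let the unshared view expire, unlock.
auto withLock(alias op, T)(shared(T)* data, Mutex m)
{
    m.lock();                   // 1. ensure single-threaded access
    scope (exit) m.unlock();    // 5. release the mutex on every path
    return op(*cast(T*) data);  // 2.-4. op sees an unshared view only
                                //       while the lock is held
}

struct Account { int balance; }

shared Account acct;
__gshared Mutex acctLock;

void main()
{
    acctLock = new Mutex;
    withLock!((ref Account a) => a.balance += 100)(&acct, acctLock);
    assert(withLock!((ref Account a) => a.balance)(&acct, acctLock) == 100);
}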
Nov 13 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 2:07 PM, Peter Alexander wrote:
 On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
Oh ok, thanks. That does make sense. There's been quite a bit of discussion between Bartosz, Walter, and myself about allowing transparent loads and stores as opposed to defining intrinsics x.load and x.store(y). In C++11 both transparent and explicit are allowed, and an emergent idiom is "always use the explicit versions because they clarify flow and cost". Andrei
Nov 13 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 13/11/2012 23:07, Peter Alexander wrote:
 On Tuesday, 13 November 2012 at 21:56:21 UTC, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this. Andrei
I'm speaking out of turn, but... I'll flip that around: what would shared do if there were memory barriers? Walter has said previously in this thread that shared is to be used to mark shared data, and disallow any potentially non-thread-safe operations. To use shared data, you need to manually lock it and then cast away the shared temporarily. This can be made more pleasant with library utilities.
It cannot unless some ownership is introduced in D.
Nov 13 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code

I also worry that it will lure programmers into a false sense of complacency about shared, that simply adding "shared" to a type will make their concurrent code work. Few seem to realize that adding memory barriers only makes code sequentially consistent, it does *not* eliminate race conditions. It just turns a multicore machine into (logically) a single core one, *not* a single threaded one.

But I do see enormous value in shared in that it logically (and rather forcefully) separates thread-local code from multi-thread code. For example, see the post here about adding a destructor to a shared struct, and having it fail to compile. The complaint was along the lines of shared being broken, whereas I viewed it along the lines of shared pointing out a logic problem in the code - what does destroying a struct accessible from multiple threads mean? I think it must be clear that destroying an object can only happen in one thread, i.e. the object must become thread local in order to be destroyed.
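The "sequentially consistent is not race-free" point in a few lines: every access below is an atomic, sequentially consistent load or store (core.atomic names from current druntime), yet the two threads still race on the read-modify-write as a whole and increments get lost:

import core.atomic : atomicLoad, atomicStore;
import core.thread : Thread;
import std.stdio : writeln;

shared int counter;

void bump()
{
    foreach (i; 0 .. 1_000_000)
    {
        // The load and the store are each atomic and sequentially
        // consistent -- but another thread can slip in between them,
        // so increments are lost.
        int v = atomicLoad(counter);
        atomicStore(counter, v + 1);
    }
}

void main()
{
    auto a = new Thread(&bump);
    auto b = new Thread(&bump);
    a.start(); b.start();
    a.join();  b.join();
    writeln(atomicLoad(counter));  // almost certainly < 2_000_000
}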
Nov 13 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 2:22 PM, Walter Bright wrote:
 On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code
I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this:

shared int x;
...
x = 4;

You'll need to use x.load(4) instead.

Just for the record I'm okay with this breakage.
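For reference, core.atomic already spells this explicitly today; presumably x.store/x.load would be sugar for something like these free functions, which default to sequentially consistent ordering:

import core.atomic : atomicLoad, atomicStore;

shared int x;

void main()
{
    // x = 4;             // the implicit form the proposal would reject
    atomicStore(x, 4);    // explicit, atomic, sequentially consistent
    int v = atomicLoad(x);
    assert(v == 4);
}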
 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions.
It does eliminate all low-level races.
 It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
This is very approximate.
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
Nov 13 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu 
wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.load(4) instead.
You mean x.store(4)? Or am I completely misunderstanding your message? David
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:07 PM, David Nadlinger wrote:
 On Tuesday, 13 November 2012 at 22:33:51 UTC, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.load(4) instead.
You mean x.store(4)? Or am I completely misunderstanding your message? David
Apologies, yes, store. Andrei
Nov 13 2012
prev sibling next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 On 11/13/12 2:22 PM, Walter Bright wrote:
 On 11/13/2012 1:56 PM, Andrei Alexandrescu wrote:
 On 11/13/12 1:28 PM, Walter Bright wrote:
 On 11/13/2012 1:11 AM, luka8088 wrote:
 This clarifies a lot, but still a lot of people get confused with:
 http://dlang.org/faq.html#shared_memory_barriers
 is it a faq error ?
Andrei is a proponent of having shared do memory barriers; I disagree with him. We haven't convinced each other yet, so this is a bit up in the air.
Wait, then what would shared do? This is new to me as I've always assumed you and I have the same view on this.
I'm just not convinced that having the compiler add memory barriers:

1. will result in correctly working code, when done by programmers who have only an incomplete understanding of memory barriers, which would be about 99.9% of us.

2. will result in efficient code

I'm fine with these arguments. We'll need to break current uses of shared then. What you say is that essentially you can't do even this:

shared int x;
...
x = 4;

You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store? (I know you meant store.)
 Just for the record I'm okay with this breakage.

 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions.
It does eliminate all low-level races.
 It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
This is very approximate.
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
As long as a cast is required along the way, we can't claim victory. I need to think about that scenario. Andrei
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

Scratch that, make it this:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* references
* function pointers

Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.

-- Alex Rønne Petersen alex lycus.org http://lycus.org
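The sizes driving the amendment can be checked directly: a slice is (pointer, length) and a delegate is (context, function pointer), i.e. two machine words each -- 16 bytes on a 64-bit target, past the 8-byte atomic load/store that can be guaranteed broadly:

// Two machine words each; 16 bytes when size_t is 8 bytes.
static assert((int[]).sizeof == 2 * size_t.sizeof);
static assert((void delegate()).sizeof == 2 * size_t.sizeof);

void main() {}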
Nov 13 2012
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 00:48, Alex Rønne Petersen wrote:
 On 14-11-2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

Scratch that, make it this:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* references
* function pointers

Slices and delegates can't be loaded/stored atomically because very few architectures provide instructions to atomically load/store 16 bytes of data (required on 64-bit; 32-bit would be fine since that's just 8 bytes, but portability is king). This is also why ucent, cent, and real are not included in the list.

That list sounds more reasonable.
Nov 13 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit. (I deliberately talk in terms of bytes here because that's the nomenclature most architecture manuals use from what I've seen.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 03:02, Andrei Alexandrescu wrote:
 On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very few
 architectures provide instructions to atomically load/store 16 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 11/14/2012 3:05 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 03:02, Andrei Alexandrescu wrote:
 On 11/13/12 5:58 PM, Alex Rønne Petersen wrote:
 On 14-11-2012 02:52, Andrei Alexandrescu wrote:
 On 11/13/12 3:48 PM, Alex Rønne Petersen wrote:
 Slices and delegates can't be loaded/stored atomically because very
 few
 architectures provide instructions to atomically load/store 16
 bytes of
 data (required on 64-bit; 32-bit would be fine since that's just 8
 bytes, but portability is king). This is also why ucent, cent, and
 real
 are not included in the list.
When I wrote TDPL I looked at the contemporary architectures and it seemed all were or were about to support double-word atomic ops. So the intent is to allow shared delegates and slices. Are there any architectures today that don't support double-word load, store, and CAS? Andrei
I do not know of a single architecture apart from x86 that supports > 8-byte load/store/CAS (and come to think of it, I'm not so sure x86 actually can do 16-byte load/store, only CAS). So while a shared delegate is doable in 32-bit, it isn't really in 64-bit.
Intel does 128-bit atomic load and store, see http://www.intel.com/content/www/us/en/processors/itanium/itanium-architecture-software-developer-rev-2-3-vol-2-manual.html, "4.5 Memory Datum Alignment and Atomicity". Andrei
That's Itanium, though, not x86. Itanium is a fairly high-end, enterprise-class thing, so that's not very surprising.
On x86 you can use LOCK CMPXCHG16b to do the atomic read: http://stackoverflow.com/questions/9726566/atomic-16-byte-read-on-x64-cpus This just excludes a small number of early AMD processors.
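The read-via-CAS idiom from that answer, sketched on an 8-byte value with druntime's core.atomic.cas; the 16-byte case works the same way once the compiler emits LOCK CMPXCHG16B for a two-word payload:

import core.atomic : cas;

// Atomic read built from compare-and-swap alone: try to replace the
// current value with itself; when the CAS succeeds, the guess was the
// value at the instant of the exchange.
ulong atomicRead(shared(ulong)* v)
{
    ulong guess = 0;
    while (!cas(v, guess, guess))
    {
        // CAS failed: take a fresh (possibly torn) guess and retry.
        // The loop exits only once the CAS confirms it atomically.
        guess = *cast(ulong*) v;
    }
    return guess;
}

void main()
{
    shared ulong x = 0x1234_5678_9ABC_DEF0;
    assert(atomicRead(&x) == 0x1234_5678_9ABC_DEF0);
}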
Nov 14 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 00:43, Alex Rønne Petersen wrote:
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

I wouldn't expect it to work for delegates, long, ulong, double, and slices on every arch. If it does work, that is awesome, and it adds to my determination that this is the thing to do.
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 01:09, deadalnix wrote:
 Le 14/11/2012 00:43, Alex Rønne Petersen a écrit :
 On 14-11-2012 00:38, Andrei Alexandrescu wrote:
 On 11/13/12 3:28 PM, Alex Rønne Petersen wrote:
 On 13-11-2012 23:33, Andrei Alexandrescu wrote:
 shared int x;
 ...
 x = 4;

 You'll need to use x.store(4) instead.
Is that meant to be an atomic store, or just a regular, but explicit, store?
Atomic and sequentially consistent. Andrei
OK, but then we have the problem I presented in the OP: This only works for certain types, on certain architectures, for certain processors, ...

So, we could limit shared load/store to only work on certain types and require all architectures that D compilers target to provide those. *But* this means that shared on any non-primitive types becomes essentially useless and will in 99% of cases just be casted away. On the other hand, if we make it implementation-defined, people end up writing highly unportable code.

So, (unless anyone can come up with better alternatives), I think guaranteeing atomic load/store for a certain set of types is the most sensible way forward. FWIW, these are the types and type categories I'd expect shared load/store to work on, on any architecture:

* ubyte, byte
* ushort, short
* uint, int
* ulong, long
* float, double
* pointers
* slices
* references
* function pointers
* delegates

I wouldn't expect it to work for delegates, long, ulong, double, and slices on every arch. If it does work, that is awesome, and it adds to my determination that this is the thing to do.

8-byte atomic loads/stores are doable on all major architectures. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:33 PM, Alex Rønne Petersen wrote:
 8-byte atomic loads/stores are doable on all major architectures.
We're looking at 128-bit load, store, and CAS for 64-bit machines. Andrei
Nov 13 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
Nov 13 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again... -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :
 On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...
http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
Nov 13 2012
parent =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 03:00, deadalnix wrote:
 Le 14/11/2012 02:36, Alex Rønne Petersen a écrit :
 On 14-11-2012 02:33, Walter Bright wrote:
 On 11/13/2012 3:43 PM, Alex Rønne Petersen wrote:
 FWIW, these are the types and type categories I'd expect shared
 load/store to
 work on, on any architecture:

 * ubyte, byte
 * ushort, short
 * uint, int
 * ulong, long
 * float, double
 * pointers
 * slices
 * references
 * function pointers
 * delegates
Not going to portably work on long, ulong, double, slices, or delegates. (The compiler should issue an error where it won't work, and allow it where it does, letting the user decide what to do about the non-working cases.)
I amended that (see my other post). 8-byte loads/stores can be done atomically on all relevant architectures today. Andrei linked a page a while back that explained how to do it on x86, ARM, MIPS, and PowerPC (if memory serves), but I can't seem to find it again...
http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
Thanks, exactly that. No MIPS, though. I guess I'm going to have to go dig through their manuals. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 13 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I need to
 think about that scenario.
Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Nov 13 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 5:29 PM, Walter Bright wrote:
 On 11/13/2012 2:33 PM, Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I
 need to
 think about that scenario.
Our car doesn't have an electric starter yet, but it's still better than a horse :-)
Please don't. This is "we're doing better than C++" in disguise and exactly the wrong frame of mind. I find few things more negatively disruptive than lulling into a false sense of achievement. Andrei
Nov 13 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 14:33:50 Andrei Alexandrescu wrote:
 As long as a cast is required along the way, we can't claim victory. I
 need to think about that scenario.
At this point, I don't see how it could be otherwise. Having the shared equivalent of const would just lead to that being used everywhere and defeat the purpose of shared in the first place. If it's not segregated, it's not doing its job. But that leaves us with most functions not working with shared, which is also a problem. Templates are a partial solution, but they obviously don't work for everything.

In general, I would expect that all uses of shared would be protected by a mutex or synchronized block or other similar construct. It's just going to cause problems to do otherwise. There are some cases where if you can guarantee that writes and reads are atomic, you're fine skipping the mutexes, but those are relatively rare, particularly when you consider the issues in making anything but extremely trivial writes or reads atomic.

That being the case, it doesn't really seem all that unreasonable to me for it to be normal to have to cast shared to non-shared to pass to functions, as long as all of that code is protected with a mutex or another, similar construct - though if those functions aren't pure, you _could_ run into entertaining problems when a non-shared reference to the data gets squirreled away somewhere in those function calls. But we seem to have contradictory requirements here of trying to segregate shared from normal, thread-local stuff while still looking to be able to use shared with functions intended to be used with non-shared stuff.

- Jonathan M Davis
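A minimal sketch of the lock-then-cast idiom described above, with invented names (Data, g_lock, process); the cast is only sound because every access to g_data goes through g_lock:

import core.sync.mutex : Mutex;

struct Data { int[] values; }

shared Data g_data;
__gshared Mutex g_lock;

shared static this() { g_lock = new Mutex; }

void update()
{
    g_lock.lock();
    scope (exit) g_lock.unlock();

    // Lock held: temporarily treat the data as thread-local so that
    // ordinary (non-shared) functions can operate on it.
    auto local = cast(Data*) &g_data;
    process(local);
}

void process(Data* d) { d.values ~= 42; } // an ordinary non-shared function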
Nov 13 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 13/11/2012 23:22, Walter Bright a écrit :
 I'm just not convinced that having the compiler add memory barriers:

 1. will result in correctly working code, when done by programmers who
 have only an incomplete understanding of memory barriers, which would be
 about 99.9% of us.

 2. will result in efficient code

 I also worry that it will lure programmers into a false sense of
 complacency about shared, that simply adding "shared" to a type will
 make their concurrent code work. Few seem to realize that adding memory
 barriers only makes code sequentially consistent, it does *not*
 eliminate race conditions. It just turns a multicore machine into
 (logically) a single core one, *not* a single threaded one.
That is what Java's volatile does. It has several use cases, including valid double-checked locking (it has to be noted that this idiom is used incorrectly in druntime ATM, which proves both its usefulness and that it requires language support) and the Disruptor, which I wanted to implement for message passing in D but couldn't because of lack of support at the time.

See: http://www.slideshare.net/trishagee/introduction-to-the-disruptor

So sequentially consistent reads/writes are useful.
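For concreteness, a minimal sketch of that double-checked locking pattern in D, with invented names, using core.atomic's (sequentially consistent by default) atomicLoad/atomicStore in the role Java's volatile plays; it assumes core.atomic handles the pointer type as pointer-to-shared:

import core.atomic : atomicLoad, atomicStore;

struct Config { int verbosity; }

shared Config* g_config; // lazily initialized, visible to all threads

shared(Config)* getConfig()
{
    auto c = atomicLoad(g_config); // first check: lock-free, but atomic
    if (c is null)
    {
        synchronized // bare synchronized statement: implicit global mutex
        {
            c = atomicLoad(g_config); // second check, under the lock
            if (c is null)
            {
                c = new shared(Config)(1);
                atomicStore(g_config, c); // publish; a plain assignment
                                          // here is the classic bug
            }
        }
    }
    return c;
}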
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
language multithread, have everything shared and are still able to have finalizers of some sort. Why couldn't a shared object be destroyed? Why should it be destroyed in a specific thread when it can only refer to shared data because of transitivity?
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 4:04 PM, deadalnix wrote:
 That is what java's volatile do. It have several uses cases, including valid
 double check locking (It has to be noted that this idiom is used incorrectly in
 druntime ATM,
Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.
 So sequentially consistent read/write are usefull.
Sure, I agree with that.

 multithread, have everything shared and still are able to have finalizer of
some
 sort.
I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.
 Why couldn't a shared object be destroyed ? Why should it be destroyed in a
 specific thread as it can only refer shared data because of transitivity ?
How can you destroy an object in one thread when another thread is holding live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Nov 13 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 02:39, Walter Bright a écrit :
 On 11/13/2012 4:04 PM, deadalnix wrote:
 That is what java's volatile do. It have several uses cases, including
 valid
 double check locking (It has to be noted that this idiom is used
 incorrectly in
 druntime ATM,
Please, please file a bug report about this, rather than a vague statement here. If there already is one, please post its number.
http://d.puremagic.com/issues/show_bug.cgi?id=6607
 So sequentially consistent read/write are usefull.
Sure, I agree with that.

 language
 multithread, have everything shared and still are able to have
 finalizer of some
 sort.
I understand, though, that they take steps to ensure that the finalizer is run in one thread and no other thread still has access to it - i.e. it is converted back to a local reference.
 Why couldn't a shared object be destroyed ? Why should it be destroyed
 in a
 specific thread as it can only refer shared data because of
 transitivity ?
How can you destroy an object in one thread when another thread is holding live references to it? (Well, how can you destroy it without causing corruption bugs, that is.)
Why would you destroy something that isn't dead yet?
Nov 13 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, 
 including valid double-checked locking (it has to be noted that 
 this idiom is used incorrectly in druntime ATM, which proves 
 both its usefulness and that it requires language support) and 
 the Disruptor, which I wanted to implement for message passing 
 in D but couldn't because of lack of support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
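For illustration, a hedged sketch of that API; the flag/data names are invented, and the MemoryOrder members used (raw, acq, rel) are druntime's:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared int data;
shared bool ready;

void producer()
{
    atomicStore!(MemoryOrder.raw)(data, 42);    // the payload
    atomicStore!(MemoryOrder.rel)(ready, true); // release: payload is
                                                // visible before the flag
}

void consumer()
{
    if (atomicLoad!(MemoryOrder.acq)(ready)) // acquire: pairs with the release
        assert(atomicLoad!(MemoryOrder.raw)(data) == 42);
}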
Nov 14 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 13:23, David Nadlinger a écrit :
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
It is a solution now (it wasn't at the time). The main drawback of that solution is that the compiler can't optimize thread-local reads/writes across shared reads/writes. This is a wasted opportunity.
Nov 14 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 13:19:12 UTC, deadalnix wrote:
 The main drawback of that solution is that the compiler can't
 optimize thread-local reads/writes across shared reads/writes.
 This is a wasted opportunity.
You mean moving non-atomic loads/stores across atomic instructions? This is simply a matter of the compiler providing the right intrinsics for implementing the core.atomic functions. LDC already does it. David
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
Nov 14 2012
next sibling parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 15:32, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
They already work as they should:

* DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work.
* GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency).
* LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe.

I don't think there's anything that actually needs to be fixed there. -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:11 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:32, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume). Andrei
They already work as they should:

* DMD: They use inline asm, so they're guaranteed to not be reordered. Calls aren't reordered with DMD either, so even if the former wasn't the case, it'd still work.
* GDC: They map directly to the GCC __sync_* builtins, which have the semantics you describe (with full sequential consistency).
* LDC: They map to LLVM's load/store instructions with the atomic flag set and with the given atomic consistency, which have the semantics you describe.

I don't think there's anything that actually needs to be fixed there.
The language definition should be made clear so that future optimizations of existing implementations, and future implementations, don't push things over the limit. Andrei
Nov 14 2012
prev sibling next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei 
Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix 
 wrote:
 That is what Java's volatile does. It has several use cases, 
 including valid double-checked locking (it has to be noted 
 that this idiom is used incorrectly in druntime ATM, which 
 proves both its usefulness and that it requires language 
 support) and the Disruptor, which I wanted to implement for 
 message passing in D but couldn't because of lack of support 
 at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
Sorry, I don't quite see where I simplified things. Yes, in the implementation of atomicLoad/atomicStore, one would probably use compiler intrinsics, as done in LDC's druntime, or inline assembly, as done for DMD. But an optimizer will never move instructions across opaque function calls, because they could have arbitrary side effects. So, either we are fine by definition, or if the compiler inlines the atomicLoad/atomicStore calls (which is actually possible in LDC), then its optimizer will detect the presence of inline assembly resp. the load/store intrinsics, and take care of not reordering the instructions in an invalid way. I don't see how this makes my answer to deadalnix (that »volatile« is not necessary to implement sequentially consistent loads/stores) any less valid. David
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 8:59 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:32:34 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 4:23 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 00:04:56 UTC, deadalnix wrote:
 That is what Java's volatile does. It has several use cases, including
 valid double-checked locking (it has to be noted that this idiom is used
 incorrectly in druntime ATM, which proves both its usefulness and
 that it requires language support) and the Disruptor, which I wanted to
 implement for message passing in D but couldn't because of lack of
 support at the time.
What stops you from using core.atomic.{atomicLoad, atomicStore}? I don't know whether there might be a weird spec loophole which could theoretically lead to them being undefined behavior, but I'm sure that they are guaranteed to produce the right code on all relevant compilers. You can even specify the memory order semantics if you know what you are doing (although this used to trigger a template resolution bug in the frontend, no idea if it works now). David
This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
Sorry, I don't quite see where I simplified things.
First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.
 Yes, in the
 implementation of atomicLoad/atomicStore, one would probably use
 compiler intrinsics, as done in LDC's druntime, or inline assembly, as
 done for DMD.

 But an optimizer will never move instructions across opaque function
 calls, because they could have arbitrary side effects.
Nowhere in the language definition is it explained what an opaque function call is and what optimizations can and cannot be done in the presence of one.
 So, either we are
 fine by definition,
s/definition/happenstance/
 or if the compiler inlines the
 atomicLoad/atomicStore calls (which is actually possible in LDC), then
 its optimizer will detect the presence of inline assembly resp. the
 load/store intrinsics, and take care of not reordering the instructions
 in an invalid way.

 I don't see how this makes my answer to deadalnix (that »volatile« is
 not necessary to implement sequentially consistent loads/stores) any
 less valid.
Using load/store everywhere would make volatile unneeded (and for us, shared). But the advantage there is that you qualify the type/value once and then you don't need to remember to only use specific primitives to manipulate it. Andrei
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 9:50 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 First, there are more kinds of atomic loads and stores. Then, the fact that the calls are not supposed to be reordered must be a guarantee of the language, not a speculation about an implementation. We can't argue that a feature works just because it so happens an implementation works a specific way.
I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short-term thing.
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 12:04 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 9:50 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 First, there are more kinds of atomic loads and stores. Then, the fact that
the calls are not supposed to be reordered must be a guarantee of the language,
not a speculation about an implementation. We can't argue that a feature works
just because it so happens an implementation works a specific way.
I've always been a fan of release consistency, and it dovetails well with the behavior of mutexes (http://en.wikipedia.org/wiki/Release_consistency). It would be cool if we could sort out transactional memory as well, but that's not a short term thing.
I think we should focus on sequential consistency as that's where the industry is converging. Andrei
Nov 14 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).
No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.
Nov 14 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 This is a simplification of what should be going on. The
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the
compiler generates sequentially consistent code with them (i.e. does not perform
certain reorderings). Then there are loads and stores with weaker consistency
semantics (acquire, release, acquire/release, and consume).
No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.
Nov 15 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 4:54 AM, deadalnix <deadalnix gmail.com> wrote:

 Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).

 No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.

 It is sufficient for single-core and mostly correct for x86, but it isn't enough.

 volatile isn't for concurrency, but for memory mapping.
Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?
Nov 15 2012
parent deadalnix <deadalnix gmail.com> writes:
Le 15/11/2012 17:33, Sean Kelly a écrit :
 On Nov 15, 2012, at 4:54 AM, deadalnix<deadalnix gmail.com>  wrote:

 Le 14/11/2012 21:01, Sean Kelly a écrit :
 On Nov 14, 2012, at 6:32 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>   wrote:
 This is a simplification of what should be going on. The
core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the
compiler generates sequentially consistent code with them (i.e. does not perform
certain reorderings). Then there are loads and stores with weaker consistency
semantics (acquire, release, acquire/release, and consume).
No. These functions all contain volatile ask blocks. If the compiler respected the "volatile" it would be enough.
It is sufficient for single-core and mostly correct for x86, but it isn't enough. volatile isn't for concurrency, but for memory mapping.
Traditionally, the term "volatile" is for memory mapping. The description of "volatile" for D1, though, would have worked for concurrency. Or is there some example you can provide where this isn't true?
I'm not aware of the D1 compiler inserting memory barriers, so any memory operation reordering done by the CPU would have screwed things up.
Nov 16 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 12:01 PM, Sean Kelly <sean invisibleduck.org> wrote:

 On Nov 14, 2012, at 6:32 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 This is a simplification of what should be going on. The core.atomic.{atomicLoad, atomicStore} functions must be intrinsics so the compiler generates sequentially consistent code with them (i.e. does not perform certain reorderings). Then there are loads and stores with weaker consistency semantics (acquire, release, acquire/release, and consume).

 No.  These functions all contain volatile ask blocks.  If the compiler respected the "volatile" it would be enough.
asm blocks.  Darn auto-correct.
Nov 14 2012
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-13 23:22, Walter Bright wrote:

 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough? -- /Jacob Carlborg
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
Nov 14 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language? -- /Jacob Carlborg
Nov 14 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:31 AM, Jacob Carlborg wrote:
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
Not that I can think of.
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 11:38, Walter Bright wrote:

 Not that I can think of.
Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work. -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:47 AM, Jacob Carlborg wrote:
 On 2012-11-14 11:38, Walter Bright wrote:

 Not that I can think of.
Then we might want to remove it since it's either not working or basically everyone has misunderstood how it should work.
Actually this hypothesis is false. Andrei
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 15:33, Andrei Alexandrescu wrote:

 Actually this hypothesis is false.
That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true. -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:14 AM, Jacob Carlborg wrote:
 On 2012-11-14 15:33, Andrei Alexandrescu wrote:

 Actually this hypothesis is false.
That we should remove it or that it's not working/nobody understands what it should do? If it's the latter then this thread is the evidence that my hypothesis is true.
The hypothesis that atomic primitives can be implemented as a library. Andrei
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 18:36, Andrei Alexandrescu wrote:

 The hypothesis that atomic primitives can be implemented as a library.
I don't know these kind of things, that's why I'm asking. -- /Jacob Carlborg
Nov 14 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 10:31, Jacob Carlborg a écrit :
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
The compiler can do more reordering with regard to barriers. For instance, the compiler may reorder thread-local reads/writes across the barrier. This can't be done with a library solution.
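A sketch of what that means in practice, with invented names (tls_counter is thread-local by default in D):

import core.atomic : atomicStore;

shared int flag;
int tls_counter; // thread-local: no other thread can observe it

void publish()
{
    tls_counter++; // a compiler that understands the atomic store below
                   // may still move this access across it, because the
                   // variable is provably thread-local...
    atomicStore(flag, 1); // ...whereas an opaque library call would force
                          // it to keep the original order, just in case
}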
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 12:04, deadalnix wrote:

 The compiler can do more reordering with regard to barriers. For instance,
 the compiler may reorder thread-local reads/writes across the barrier.

 This can't be done with a library solution.
I see. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:31 AM, Jacob Carlborg wrote:
 On 2012-11-14 10:20, Walter Bright wrote:

 Memory barriers can certainly be added using library functions.
Is there then any real advantage of having it directly in the language?
It's not an advantage, it's a necessity. Andrei
Nov 14 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 15:22, Andrei Alexandrescu wrote:

 It's not an advantage, it's a necessity.
Walter seems to indicate that there is no technical reason for "shared" to be part of the language. I don't know how these memory barriers work, that's why I'm asking. Does it need to be in the language or not? -- /Jacob Carlborg
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:16 AM, Jacob Carlborg wrote:
 On 2012-11-14 15:22, Andrei Alexandrescu wrote:

 It's not an advantage, it's a necessity.
Walter seems to indicate that there is no technical reason for "shared" to be part of the language.
Walter is a self-confessed dilettante in threading. To be frank I hope he asks more and answers less in this thread.
 I don't know how these memory barriers work,
 that's why I'm asking. Does it need to be in the language or not?
Memory ordering must be built into the language and understood by the compiler. Andrei
Nov 14 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 18:40, Andrei Alexandrescu wrote:

 Memory ordering must be built into the language and understood by the
 compiler.
Ok, thanks for the expatiation. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier. Andrei
Nov 14 2012
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei 
Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is 
 there a
 reason for
 having it built into the language? Can a library solution be 
 enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«. David
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.
Compiler intrinsics ====== built into the language. Andrei
Nov 14 2012
parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 14 November 2012 17:50, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it
 doesn't hoist code above an acquire barrier or below a release barrier.

 Again, this is true, but it would be a fallacy to conclude that
 compiler-inserted memory barriers for »shared« are required due to this
 (and it is »shared« we are discussing here!).

 Simply having compiler intrinsics for atomic loads/stores is enough,
 which is hardly »built into the language«.

 Compiler intrinsics ====== built into the language.

 Andrei

Not necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it.

Regards,
--
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 14 2012
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 11:21 AM, Iain Buclaw wrote:
 On 14 November 2012 17:50, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org>  wrote:
 On 11/14/12 9:15 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 14:16:57 UTC, Andrei Alexandrescu wrote:
 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
Again, this is true, but it would be a fallacy to conclude that compiler-inserted memory barriers for »shared« are required due to this (and it is »shared« we are discussing here!). Simply having compiler intrinsics for atomic loads/stores is enough, which is hardly »built into the language«.
Compiler intrinsics ====== built into the language. Andrei
Not necessarily. For example, printf is a compiler intrinsic for GDC, but it's not built into the language in the sense of the compiler *provides* the codegen for it. Though it is aware of what it is and what it does, so can perform relevant optimisations around the use of it.
aware of what it is and what it does ====== built into the language. Andrei
Nov 14 2012
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.
Nov 14 2012
next sibling parent reply =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example. See also: http://prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP20 -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.

 That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.

 The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.

Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block.  Or for a first cut just insert a full barrier at the beginning and end of the block.  Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use.

I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now.  But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files.  Druntime actually had this for core.atomic on PPC until not too long ago.
Nov 14 2012
parent =?ISO-8859-1?Q?Alex_R=F8nne_Petersen?= <alex lycus.org> writes:
On 14-11-2012 21:15, Sean Kelly wrote:
 On Nov 14, 2012, at 12:07 PM, Alex Rønne Petersen <alex lycus.org> wrote:

 On 14-11-2012 21:00, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
The volatile statement was too general. All relevant compiler back ends today only know of two kinds of volatile operations: Loads and stores. Volatile statements couldn't ever be properly implemented in GDC and LDC for example.
Well, the semantics of volatile are that there's an acquire barrier before the statement block and a release barrier after the statement block. Or for a first cut just insert a full barrier at the beginning and end of the block. Either way, it should be pretty simple for a compiler to handle if the compiler supports mutex use. I do like the idea of built-in load and store intrinsics only because D only supports x86 assembler right now. But really, it would be just as easy to fan out a D template function to a bunch of C functions implemented in separate ASM code files. Druntime actually had this for core.atomic on PPC until not too long ago.
Well, there's not much point in that when all compilers have intrinsics anyway (e.g. GDC has __sync_* and __atomic_* and LDC has some intrinsics in ldc.intrinsics that map to certain LLVM instructions). -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
Because it's better to associate volatility with data than with code. Andrei
Nov 14 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 2:21 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.

 That was the point of the now deprecated "volatile" statement.  I still don't entirely understand why it was deprecated.

 Because it's better to associate volatility with data than with code.

Fair enough.  Though this may mean building a bunch of different forms of volatility into the language.  I always saw "volatile" as a library tool anyway, so while making it code-related was a bit weird, it was a sufficient tool for the job.
Nov 14 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :
 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?
Memory barriers can certainly be added using library functions.
The compiler must understand the semantics of barriers such as e.g. it doesn't hoist code above an acquire barrier or below a release barrier.
That was the point of the now deprecated "volatile" statement. I still don't entirely understand why it was deprecated.
Because it's better to associate volatility with data than with code.
Happy to see I'm not alone on that one. Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient, for concurrency.
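A sketch of what "no register promotion" rules out, with invented names:

import core.atomic : atomicLoad;

shared bool stop;

void worker()
{
    // atomicLoad forces a fresh read on each iteration; a plain read of
    // `stop` could legally be promoted to a register and never re-checked,
    // turning this into an infinite loop.
    while (!atomicLoad(stop))
    {
        // do work
    }
}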
Nov 15 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 5:10 AM, deadalnix <deadalnix gmail.com> wrote:

 Le 14/11/2012 23:21, Andrei Alexandrescu a écrit :
 On 11/14/12 12:00 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 6:16 AM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:20 AM, Walter Bright wrote:
 On 11/13/2012 11:37 PM, Jacob Carlborg wrote:
 If the compiler should/does not add memory barriers, then is there a
 reason for
 having it built into the language? Can a library solution be enough?

 Memory barriers can certainly be added using library functions.

 The compiler must understand the semantics of barriers such as e.g.
 it doesn't hoist code above an acquire barrier or below a release
 barrier.

 That was the point of the now deprecated "volatile" statement. I still
 don't entirely understand why it was deprecated.

 Because it's better to associate volatility with data than with code.

 Happy to see I'm not alone on that one.

 Plus, volatile and sequential consistency are two different beasts. Volatile means no register promotion and no load/store reordering. It is required, but not sufficient, for concurrency.

It's sufficient for concurrency when coupled with library code that does the hardware-level synchronization.  In short, a program has two separate machines doing similar optimizations on it: the compiler and the CPU.  In D we can use ASM to control CPU optimizations, and in D1 we had "volatile" to control compiler optimizations.  "volatile" was the minimum required for handling the compiler portion and was easy to get wrong, but it used only one keyword and I suspect was relatively easy to implement on the compiler side as well.
Nov 15 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/13/12 11:37 PM, Jacob Carlborg wrote:
 On 2012-11-13 23:22, Walter Bright wrote:

 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
If the compiler should/does not add memory barriers, then is there a reason for having it built into the language? Can a library solution be enough?
The compiler must be in this so as to not do certain reorderings. Andrei
Nov 14 2012
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, November 13, 2012 14:22:07 Walter Bright wrote:
 I'm just not convinced that having the compiler add memory barriers:
 
 1. will result in correctly working code, when done by programmers who have
 only an incomplete understanding of memory barriers, which would be about
 99.9% of us.
 
 2. will result in efficient code
Being able to have double-checked locking work would be valuable, and having memory barriers would reduce race condition weirdness when locks aren't used properly, so I think that it would be desirable to have memory barriers. If there's a major performance penalty though, that might be a reason not to do it. Certainly, I don't think that there's any question that adding memory barriers won't make it so that you don't need mutexes or synchronized blocks or whatnot. shared's primary benefit is in logically separating normal code from code that must share data across threads and making it possible for the compiler to optimize based on the fact that it knows that a variable is thread-local. - Jonathan M Davis
Nov 13 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and having
 memory barriers would reduce race condition weirdness when locks aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
Nov 14 2012
parent reply =?UTF-8?B?QWxleCBSw7hubmUgUGV0ZXJzZW4=?= <alex lycus.org> writes:
On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining. (And note that you can't optimize this either; since the dependencies memory barriers are supposed to express are subtle and not detectable by a compiler, the compiler would always have to insert them because it can't know when it would be safe not to.) -- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 14/11/2012 15:39, Alex Rønne Petersen a écrit :
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.
In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load. See http://g.oswego.edu/dl/jmm/cookbook.html for more information on the barriers required on different architectures.
 (And note that you can't optimize this either; since the dependencies
 memory barriers are supposed to express are subtle and not detectable by
 a compiler, the compiler would always have to insert them because it
 can't know when it would be safe not to.)
The compiler is aware of what is thread-local and what isn't. That means it can fully optimize TL stores and loads (e.g. doing register promotion or reordering them across shared stores/loads). This has a cost, indeed, but it is useful, and Walter's solution of casting away shared when a mutex is acquired is always available.
Nov 14 2012
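The store-followed-by-load case deadalnix mentions is the classic Dekker-style handshake. A minimal sketch with made-up flag names; only the seq variants imply the full barrier (mfence on x86) that rules the bad interleaving out:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared bool flagA, flagB;

// Thread 1 runs enterA; thread 2 runs the mirror image with the flags
// swapped. Without a full fence between the store and the load, x86's
// store buffer lets BOTH threads read false and both enter.
void enterA()
{
    atomicStore!(MemoryOrder.seq)(flagA, true); // store + full barrier
    if (!atomicLoad!(MemoryOrder.seq)(flagB))
    {
        // at most one thread can get here
    }
}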
parent Alex Rønne Petersen <alex lycus.org> writes:
On 14-11-2012 15:50, deadalnix wrote:
 On 14/11/2012 15:39, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence? Because as Walter said, inserting those blindly when unnecessary can lead to terrible performance because it practically murders pipelining.
In fact, x86 is mostly sequentially consistent due to its memory model. It only requires an mfence when a shared store is followed by a shared load.
I just used x86's fencing instructions as an example because most people here are familiar with it. The problem is much, much bigger on architectures like ARM, MIPS, and PowerPC, which have much weaker memory ordering.
 See http://g.oswego.edu/dl/jmm/cookbook.html for more information on
 the barriers required on different architectures.

 (And note that you can't optimize this either; since the dependencies
 memory barriers are supposed to express are subtle and not detectable by
 a compiler, the compiler would always have to insert them because it
 can't know when it would be safe not to.)
The compiler is aware of what is thread-local and what isn't. That means it can fully optimize TL stores and loads (e.g. doing register promotion or reordering them across shared stores/loads).
Thread-local loads and stores are not atomic and thus do not take part in the reordering constraints that atomic operations impose. See e.g. the LLVM docs for atomicrmw and atomic load/store.
 This has a cost, indeed, but it is useful, and Walter's solution of
 casting away shared when a mutex is acquired is always available.
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments. Andrei
Nov 14 2012
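As a sketch of what (a) could look like: the names below are hypothetical, and the bodies are expressed on top of the existing core.atomic only so the example compiles -- a real intrinsic would be recognized by the compiler itself, which both emits the right instructions and refrains from moving other memory operations across the call:

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

enum Ordering { acquire, release, seqCst }

T loadWith(Ordering o, T)(ref shared T location)
{
    static if (o == Ordering.acquire)
        return atomicLoad!(MemoryOrder.acq)(location);
    else static if (o == Ordering.seqCst)
        return atomicLoad!(MemoryOrder.seq)(location);
    else
        static assert(0, "release is not a load ordering");
}

void storeWith(Ordering o, T)(ref shared T location, T value)
{
    static if (o == Ordering.release)
        atomicStore!(MemoryOrder.rel)(location, value);
    else static if (o == Ordering.seqCst)
        atomicStore!(MemoryOrder.seq)(location, value);
    else
        static assert(0, "acquire is not a store ordering");
}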
next sibling parent Alex Rønne Petersen <alex lycus.org> writes:
On 14-11-2012 16:08, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
Let's continue this part of the discussion in my other reply (the one explaining how core.atomic is implemented in the various compilers).
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
I dunno. On low-end architectures like ARM, out-of-order processing is pretty much what makes them usable at all because they don't have the raw power that x86 does (I even recall an ARM Holdings executive saying that they couldn't possibly switch to a strong memory model with an in-order pipeline without severely reducing the efficiency of ARM). So I'm just putting that out there - it's definitely worth taking into consideration, because very few architectures are as strongly ordered as x86.
 Andrei
-- Alex Rønne Petersen alex lycus.org http://lycus.org
Nov 14 2012
prev sibling next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei 
Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for 
 loading and storing data with high-level semantics (a short 
 list: acquire, release, acquire+release, and 
 sequentially-consistent) and THEN (b) implement the needed code 
 generation appropriately for each architecture. Indeed on x86 
 there is little need to insert fence instructions, BUT there is 
 a definite need for the compiler to prevent certain 
 reorderings. That's why implementing shared data operations 
 (whether implicit or explicit) as sheer library code is NOT 
 possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion). Thus, »we«, meaning on a language level, don't need to change anything about the current situation, with the possible exception of adding finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the duty of the compiler writers to provide the appropriate means to implement druntime on their code generation infrastructure – and indeed, the situation in DMD could be improved; using inline asm is hitting a fly with a sledgehammer. David [1] I am not sure where the point of diminishing returns is here, although it might make sense to provide the same options as C++11. If I remember correctly, D1/Tango supported a lot more levels of synchronization.
Nov 14 2012
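For concreteness, the publish/consume handshake these primitives support, using only what core.atomic already ships (ready and payload are made-up names):

import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared bool ready;
shared int payload;

void producer()
{
    atomicStore!(MemoryOrder.raw)(payload, 42);
    // Release store: the payload write may not sink below this line.
    atomicStore!(MemoryOrder.rel)(ready, true);
}

void consumer()
{
    // Acquire load: pairs with the release store in producer.
    while (!atomicLoad!(MemoryOrder.acq)(ready)) {}
    assert(atomicLoad!(MemoryOrder.raw)(payload) == 42);
}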
next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 17:31:07 UTC, David Nadlinger 
wrote:
 Thus, »we«, meaning on a language level, don't need to change 
 anything about the current situations, […]
Let me clarify that: We don't necessarily need to tack on any extra semantics to the language other than what we currently have. However, what we must indeed do is clarify/specify the implicit consensus on which the current implementations are built. We really need a »The D Memory Model«-style document. David
Nov 14 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 9:31 AM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for loading
 and storing data with high-level semantics (a short list: acquire,
 release, acquire+release, and sequentially-consistent) and THEN (b)
 implement the needed code generation appropriately for each
 architecture. Indeed on x86 there is little need to insert fence
 instructions, BUT there is a definite need for the compiler to prevent
 certain reorderings. That's why implementing shared data operations
 (whether implicit or explicit) as sheer library code is NOT possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion).
Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.
 Thus, »we«, meaning on a language level, don't need to change anything
 about the current situation, with the possible exception of adding
 finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the
 duty of the compiler writers to provide the appropriate means to
 implement druntime on their code generation infrastructure – and indeed,
 the situation in DMD could be improved, using inline asm is hitting a
 fly with a sledgehammer.
That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.
 David


 [1] I am not sure where the point of diminishing returns is here,
 although it might make sense to provide the same options as C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies. Andrei
Nov 14 2012
next sibling parent reply Manu <turkeyman gmail.com> writes:
On 14 November 2012 19:54, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 9:31 AM, David Nadlinger wrote:

 On Wednesday, 14 November 2012 at 15:08:35 UTC, Andrei Alexandrescu wrote:
 Sorry, I was imprecise. We need to (a) define intrinsics for loading
 and storing data with high-level semantics (a short list: acquire,
 release, acquire+release, and sequentially-consistent) and THEN (b)
 implement the needed code generation appropriately for each
 architecture. Indeed on x86 there is little need to insert fence
 instructions, BUT there is a definite need for the compiler to prevent
 certain reorderings. That's why implementing shared data operations
 (whether implicit or explicit) as sheer library code is NOT possible.
Sorry, I didn't see this message of yours before replying (the perils of threaded news readers…). You are right about the fact that we need some degree of compiler support for atomic instructions. My point was that it is already available; otherwise it would have been impossible to implement core.atomic.{atomicLoad, atomicStore} (for DMD, inline asm is used, which prohibits compiler code motion).
Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.
I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day. This is a very big deal. I would be scared to see the compiler generate intrinsic calls to atomic synchronisation primitives. It's almost like banning architectures from the language. The Nintendo Wii for instance, not an unpopular machine, only sold 130 million units! Does not have synchronisation instructions in the architecture (insane, I know, but there it is. I've had to spend time working around this in the past). I'm sure it's not unique in this way. People getting fancy with lock-free/atomic operations will probably wrap it up in libraries. And they're not globally applicable; atomic memory operations don't magically solve problems, they require very specific structures and access patterns around them. I'm just not convinced they should be intrinsics issued by the language. They're just not as well standardised as 'int' or 'float'. Side note: I still think a convenient and fairly practical solution is to make 'shared' things 'lockable'; where you can lock()/unlock() them, and assignment to/from shared things is valid (no casting), but a runtime assert insists that the entity is locked whenever it is accessed. It's simplistic, but it's safe, and it works with the same primitives that already exist and are proven. Let the programmer mark the lock/unlock moments, worry about sequencing, etc... at least for the time being. Don't try and do it automatically (yet). The broad use cases in D aren't yet known, but making 'shared' useful today would be valuable.

 Thus, »we«, meaning on a language level, don't need to change anything
 about the current situation, with the possible exception of adding
 finer-grained control to core.atomic.MemoryOrder/msync [1]. It is the
 duty of the compiler writers to provide the appropriate means to
 implement druntime on their code generation infrastructure – and indeed,
 the situation in DMD could be improved; using inline asm is hitting a
 fly with a sledgehammer.
That is correct. My point is that compiler implementers would follow some specification. That specification would contain information that atomicLoad and atomicStore must have special properties that put them apart from any other functions.


  David
 [1] I am not sure where the point of diminishing returns is here,
 although it might make sense to provide the same options as C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies.


 Andrei
Nov 15 2012
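Manu's lockable-shared side note can be prototyped as a library wrapper today. A minimal sketch -- Locked is a hypothetical name, and the runtime assert stands in for any compiler support:

import core.sync.mutex : Mutex;

struct Locked(T)
{
    private shared(T)* payload;
    private Mutex mtx;
    private bool held;

    this(shared(T)* p) { payload = p; mtx = new Mutex; }

    void lock()   { mtx.lock();  held = true; }
    void unlock() { held = false; mtx.unlock(); }

    // Every access asserts, at runtime, that the lock is held, and only
    // then strips shared -- no user-visible cast required.
    ref T get()
    {
        assert(held, "accessing shared data without holding its lock");
        return *cast(T*) payload;
    }
}

Whether the held flag would itself have to be per-thread (to catch a different thread sneaking in between lock and access) is exactly the kind of detail such a design would need to settle.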
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 15/11/2012 10:08, Manu wrote:
 The Nintendo Wii for instance, not an unpopular machine, only sold 130
 million units! Does not have synchronisation instructions in the
 architecture (insane, I know, but there it is. I've had to spend time
 working around this in the past).
 I'm sure it's not unique in this way.
Can you elaborate on that?
Nov 15 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 1:08 AM, Manu wrote:
 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
     Yah, the whole point here is that we need something IN THE LANGUAGE
     DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION.

     THIS IS VERY IMPORTANT.


 I won't outright disagree, but this seems VERY dangerous to me.

 You need to carefully study all popular architectures, and consider that
 if the language is made to depend on these primitives, and the
 architecture doesn't support it, or support that particular style of
 implementation (fairly likely), then D will become incompatible with a
 huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused. Andrei
Nov 15 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 7:17 AM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.

 This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.

By extension, I'd suggest postponing anything related to classes as well.
Nov 15 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 17:17, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the two, I'm suggesting we stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term; long-term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, i.e., don't try and weave in language-level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion: runtime-assert that a shared object is locked when it is read/written, and consequently lift the cast requirement, making it compatible with templates.
Nov 16 2012
parent reply "Pragma Tix" <bizprac orange.fr> writes:
On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:
 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip]
 Side note: I still think a convenient and fairly practical 
 solution is
 to make 'shared' things 'lockable'; where you can 
 lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), 
 but a
 runtime assert insists that the entity is locked whenever it 
 is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Nov 16 2012
parent reply Manu <turkeyman gmail.com> writes:
On 16 November 2012 12:09, Pragma Tix <bizprac orange.fr> wrote:

 On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:

 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

  On 11/15/12 1:08 AM, Manu wrote:
  On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip] Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), but a
 runtime assert insists that the entity is locked whenever it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Looks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all of these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared. That's a minimum that people will expect - it's a proven, well-known pattern that many are familiar with, and it can be done in the language right now. Sugaring a feature like that is simply about improving clarity and reducing friction for users of something that D likes to advertise as being a core feature of the language.
Nov 16 2012
parent "Pragma Tix" <bizprac orange.fr> writes:
On Friday, 16 November 2012 at 10:59:02 UTC, Manu wrote:
 On 16 November 2012 12:09, Pragma Tix <bizprac orange.fr> wrote:

 On Friday, 16 November 2012 at 09:24:22 UTC, Manu wrote:

 On 15 November 2012 17:17, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

  On 11/15/12 1:08 AM, Manu wrote:
  On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it. [snip] Side note: I still think a convenient and fairly practical solution is
 to make 'shared' things 'lockable'; where you can 
 lock()/unlock() them,
 and assignment to/from shared things is valid (no casting), 
 but a
 runtime assert insists that the entity is locked whenever 
 it is
 accessed.
This (IIUC) is conflating mutex-based synchronization with memory models and atomic operations. I suggest we postpone anything related to that for the sake of staying focused.
I'm not conflating the 2, I'm suggesting to stick with the primitives that are already present and proven, at least for the time being. This thread is about addressing the problem in the short term, long term plans can simmer until they're ready, but any moves in the short term should make use of the primitives available and known to work, ie, don't try and weave in language level support for architectural atomic operations until there's a thoroughly detailed plan, and it's validated against many architectures so we know what we're losing. Libraries can already be written to do a lot of atomic stuff, but I still agree with the OP that shared should be addressed and made more useful in the short term, hence my simplistic suggestion; runtime assert that a shared object is locked when it is read/written, and consequently, lift the cast requirement, making it compatible with templates.
Seems to me that Soenke's library solution went in the right direction: http://forum.dlang.org/post/k831b6$1368$1 digitalmars.com
Looks reasonable to me; Dmitry Olshansky and luka have both made suggestions that look good to me as well. I think the only problem with all of these is that they don't really feel like a feature of the language, just some template that's not yet even in the library. D likes to claim that it is strong on concurrency; with that in mind, I'd expect to at least see one of these approaches polished, and probably even nicely sugared. That's a minimum that people will expect - it's a proven, well-known pattern that many are familiar with, and it can be done in the language right now. Sugaring a feature like that is simply about improving clarity and reducing friction for users of something that D likes to advertise as being a core feature of the language.
Hi Manu, point taken. But Dmitry and Luka just made suggestions; Soenke offers something concrete (working right NOW). I am afraid that we'll end up in a situation similar to the std.collections opera: just bla bla, and zero results. (And the collection situation hasn't been solved since the very beginning of D, not to talk about immutable collections.) Probably not en vogue: for me, transactional memory management makes sense.
Nov 16 2012
prev sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 17:17, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 11/15/12 1:08 AM, Manu wrote:

 On 14 November 2012 19:54, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 Yah, the whole point here is that we need something IN THE LANGUAGE DEFINITION about atomicLoad and atomicStore. NOT IN THE IMPLEMENTATION. THIS IS VERY IMPORTANT.

 I won't outright disagree, but this seems VERY dangerous to me. You need to carefully study all popular architectures, and consider that if the language is made to depend on these primitives, and the architecture doesn't support it, or support that particular style of implementation (fairly likely), then D will become incompatible with a huge number of architectures on that day.
All contemporary languages that are serious about concurrency support atomic primitives one way or another. We must too. There's no two ways about it.
I can't resist... D may be serious about the *idea* of concurrency, but it clearly isn't serious about concurrency yet. shared is a prime example of that. We do support atomic primitives 'one way or another'; there are intrinsics on all compilers. Libraries can use them. Again, this thread seemed to be about urgent action... D needs a LOT of work on its concurrency model, but something of an urgent fix to make a key language feature more useful needs to leverage what's there now.
Nov 16 2012
prev sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
Alexandrescu wrote:
 That is correct. My point is that compiler implementers would 
 follow some specification. That specification would contain 
 information that atomicLoad and atomicStore must have special 
 properties that put them apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
 [1] I am not sure where the point of diminishing returns is 
 here,
 although it might make sense to provide the same options as 
 C++11. If I
 remember correctly, D1/Tango supported a lot more levels of
 synchronization.
We could start with sequential consistency and then explore riskier/looser policies.
I'm not quite sure what you are saying here. The functions in core.atomic already exist, and currently offer four levels (raw, acq, rel, seq). Are you suggesting to remove the other options? David
Nov 15 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store. Andrei
Nov 15 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei 
Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei 
 Alexandrescu wrote:
 That is correct. My point is that compiler implementers would 
 follow
 some specification. That specification would contain 
 information that
 atomicLoad and atomicStore must have special properties that 
 put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable. But still, you can't move memory operations across any other arbitrary function call either (unless you can prove it is safe by inspecting the callee's body, obviously), so I don't see where atomicLoad/atomicStore would be special here. David
Nov 15 2012
next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.

 What are these special properties? Sorry, it seems like we are talking
 past each other…

 For example you can't hoist a memory operation before a shared load
 or after a shared store.

 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.

Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Nov 15 2012
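Sean's rule is the usual 'roach motel' ordering; annotated in a minimal sketch (m and the three variables are made-up names):

import core.sync.mutex : Mutex;

__gshared Mutex m;
__gshared int before, inside, after;

shared static this() { m = new Mutex; }

void f()
{
    before = 1;  // may legally sink INTO the locked region...
    m.lock();    // ...but nothing below may be hoisted above this acquire
    inside = 2;
    m.unlock();  // nothing above may be sunk below this release
    after = 3;   // may legally be hoisted INTO the locked region
}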
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger 
 <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory 
 operation you mean – moving non-volatile loads/stores across 
 volatile ones is typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«. David
Nov 15 2012
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:30 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.

 Usually not, really. Like if you implement a mutex, you don't want
 non-volatile operations to be hoisted above the mutex acquire or sunk
 below the mutex release. However, it's safe to move additional
 operations into the block where the mutex is held.

 Oh well, I was just being stupid when typing up my response: What I
 meant to say is that you _can_ reorder a set of memory operations
 involving atomic/volatile ones unless you violate the guarantees of the
 chosen memory order option.

 So, for Andrei's statement to be true, shared needs to be defined as
 making all memory operations sequentially consistent. Walter doesn't
 seem to think this is the way to go, at least if that is what he is
 referring to as »memory barriers«.

I think because of the as-if rule, the compiler can continue to optimize all it wants between volatile operations. Just not across them.
Nov 15 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 3:30 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 23:22:32 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger <see klickverbot.at> wrote:
 Well, to be picky, that depends on what kind of memory operation you
 mean – moving non-volatile loads/stores across volatile ones is
 typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
Oh well, I was just being stupid when typing up my response: What I meant to say is that you _can_ reorder a set of memory operations involving atomic/volatile ones unless you violate the guarantees of the chosen memory order option. So, for Andrei's statement to be true, shared needs to be defined as making all memory operations sequentially consistent. Walter doesn't seem to think this is the way to go, at least if that is what he is referring to as »memory barriers«.
Shared must be sequentially consistent. Andrei
Nov 15 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 15/11/2012 15:22, Sean Kelly wrote:
 On Nov 15, 2012, at 3:05 PM, David Nadlinger<see klickverbot.at>  wrote:

 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.
Usually not, really. Like if you implement a mutex, you don't want non-volatile operations to be hoisted above the mutex acquire or sunk below the mutex release. However, it's safe to move additional operations into the block where the mutex is held.
If it is known that the memory read/write is thread local, this is safe, even in the case of a mutex.
Nov 18 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 3:05 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 22:57:54 UTC, Andrei Alexandrescu wrote:
 On 11/15/12 1:29 PM, David Nadlinger wrote:
 On Wednesday, 14 November 2012 at 17:54:16 UTC, Andrei Alexandrescu
 wrote:
 That is correct. My point is that compiler implementers would follow
 some specification. That specification would contain information that
 atomicLoad and atomicStore must have special properties that put them
 apart from any other functions.
What are these special properties? Sorry, it seems like we are talking past each other…
For example you can't hoist a memory operation before a shared load or after a shared store.
Well, to be picky, that depends on what kind of memory operation you mean – moving non-volatile loads/stores across volatile ones is typically considered acceptable.
In D that's fine (as long as in-thread SC is respected) because non-shared vars are guaranteed to be thread-local.
 But still, you can't move memory operations across any other arbitrary
 function call either (unless you can prove it is safe by inspecting the
 callee's body, obviously), so I don't see where atomicLoad/atomicStore
 would be special here.
It is special because e.g. on x86 the function is often a simple unprotected load or store. So after the inliner has at it, there's nothing to stand in the way of reordering. The point is the compiler must understand the semantics of acquire and release. Andrei
Nov 15 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.
Nov 14 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared" must
 prevent the compiler from reordering them. But that's separate from
 inserting memory barriers.
It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing. Andrei
Nov 14 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 2:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared" must
 prevent the compiler from reordering them. But that's separate from
 inserting memory barriers.
 It's the same issue at hand: ordering properly and inserting barriers
 are two ways to ensure one single goal, sequential consistency. Same
 thing.

Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 4:50 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 2:25 PM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared"
 must prevent the compiler from reordering them. But that's
 separate from inserting memory barriers.
It's the same issue at hand: ordering properly and inserting barriers are two ways to ensure one single goal, sequential consistency. Same thing.
Sequential consistency is great and all, but it doesn't render concurrent code correct. At worst, it provides a false sense of security that somehow it does accomplish this, and people end up actually using it as such.
Yah, but the baseline here is acquire-release which has subtle differences that are all the more maddening. Andrei
Nov 14 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 14, 2012, at 6:28 PM, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/14/12 4:50 PM, Sean Kelly wrote:
 On Nov 14, 2012, at 2:25 PM, Andrei
 Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:

 On 11/14/12 1:09 PM, Walter Bright wrote:
 Yes. And also, I agree that having something typed as "shared"
 must prevent the compiler from reordering them. But that's
 separate from inserting memory barriers.

 It's the same issue at hand: ordering properly and inserting
 barriers are two ways to ensure one single goal, sequential
 consistency. Same thing.

 Sequential consistency is great and all, but it doesn't render
 concurrent code correct. At worst, it provides a false sense of
 security that somehow it does accomplish this, and people end up
 actually using it as such.

 Yah, but the baseline here is acquire-release which has subtle
 differences that are all the more maddening.

Really? Acquire-release always seemed to have equivalent safety to me. Typically, the user doesn't even have to understand that optimization can occur upwards across the trailing boundary of the block, etc, to produce correct code. Though I do agree that the industry is moving towards sequential consistency, so there may be no point in trying for something weaker.
Nov 15 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 14/11/2012 22:09, Walter Bright wrote:
 On 11/14/2012 7:08 AM, Andrei Alexandrescu wrote:
 On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
 On 14-11-2012 15:14, Andrei Alexandrescu wrote:
 On 11/14/12 1:19 AM, Walter Bright wrote:
 On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
 Being able to have double-checked locking work would be valuable, and
 having
 memory barriers would reduce race condition weirdness when locks
 aren't used
 properly, so I think that it would be desirable to have memory
 barriers.
I'm not saying "memory barriers are bad". I'm saying that having the compiler blindly insert them for shared reads/writes is far from the right way to do it.
Andrei
I need some clarification here: By memory barrier, do you mean x86's mfence, sfence, and lfence?
Sorry, I was imprecise. We need to (a) define intrinsics for loading and storing data with high-level semantics (a short list: acquire, release, acquire+release, and sequentially-consistent) and THEN (b) implement the needed code generation appropriately for each architecture. Indeed on x86 there is little need to insert fence instructions, BUT there is a definite need for the compiler to prevent certain reorderings. That's why implementing shared data operations (whether implicit or explicit) as sheer library code is NOT possible.
 Because as Walter said, inserting those blindly when unnecessary can
 lead to terrible performance because it practically murders
 pipelining.
I think at this point we need to develop a better understanding of what's going on before issuing assessments.
Yes. And also, I agree that having something typed as "shared" must prevent the compiler from reordering them. But that's separate from inserting memory barriers.
I'm sorry but that is dumb. What is the point of ensuring that the compiler does not reorder load/stores if the CPU is allowed to do so?
Nov 15 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:

 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so?

Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
Nov 15 2012
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> 
 wrote:
 
 What is the point of ensuring that the compiler does not 
 reorder load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway? David
Nov 15 2012
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/15/12 2:18 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:
 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?
The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. Andrei
Nov 15 2012
parent "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 15 November 2012 at 22:58:53 UTC, Andrei 
Alexandrescu wrote:
 On 11/15/12 2:18 PM, David Nadlinger wrote:
 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly 
 wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> 
 wrote:
 What is the point of ensuring that the compiler does not 
 reorder
 load/stores if the CPU is allowed to do so ?
Because we can write ASM to tell the CPU not to. We don't have any such ability for the compiler right now.
I think the question was: Why would you want to disable compiler code motion for loads/stores which are not atomic, as the CPU might ruin your assumptions anyway?
The compiler does whatever it takes to ensure sequential consistency for shared use, including possibly inserting fences in certain places. Andrei
How does this have anything to do with deadalnix' question that I rephrased at all? It is not at all clear that shared should do this (it currently doesn't), and the question was explicitly about Walter's statement that shared should disable compiler reordering, when at the same time *not* inserting barriers/atomic ops. Thus the »which are not atomic« qualifier in my message. David
Nov 15 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 2:18 PM, David Nadlinger <see klickverbot.at> wrote:

 On Thursday, 15 November 2012 at 16:43:14 UTC, Sean Kelly wrote:
 On Nov 15, 2012, at 5:16 AM, deadalnix <deadalnix gmail.com> wrote:
 What is the point of ensuring that the compiler does not reorder
 load/stores if the CPU is allowed to do so?

 Because we can write ASM to tell the CPU not to. We don't have any
 such ability for the compiler right now.

 I think the question was: Why would you want to disable compiler code
 motion for loads/stores which are not atomic, as the CPU might ruin your
 assumptions anyway?

A barrier isn't always necessary to achieve the desired ordering on a given system. But I'd still call out to ASM to make sure the intended operation happened. I don't know that I'd ever feel comfortable with "volatile x=y" even if what I'd do instead is just a MOV.
Nov 15 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-14 08:56, Jonathan M Davis wrote:

 Being able to have double-checked locking work would be valuable, and having
 memory barriers would reduce race condition weirdness when locks aren't used
 properly, so I think that it would be desirable to have memory barriers. If
 there's a major performance penalty though, that might be a reason not to do
 it. Certainly, there's no question that adding memory barriers won't
 eliminate the need for mutexes or synchronized blocks
 or whatnot. shared's primary benefit is in logically separating normal code
 from code that must share data across threads and making it possible for the
 compiler to optimize based on the fact that it knows that a variable is
 thread-local.
If there is a problem with efficiency in some cases, the developer can use __gshared and handle things manually. But of course, we don't want the developer to have to do this in most cases. -- /Jacob Carlborg
Nov 14 2012
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 13.11.2012 23:22, Walter Bright wrote:
 But I do see enormous value in shared in that it logically (and rather
 forcefully) separates thread-local code from multi-thread code. For
 example, see the post here about adding a destructor to a shared struct,
 and having it fail to compile. The complaint was along the lines of
 shared being broken, whereas I viewed it along the lines of shared
 pointing out a logic problem in the code - what does destroying a struct
 accessible from multiple threads mean? I think it must be clear that
 destroying an object can only happen in one thread, i.e. the object must
 become thread local in order to be destroyed.
I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope), so at the point where it is destroyed there should really be only one thread left. So it IS destroyed in a single-threaded context. The same is done for classes by the GC, except that the GC ignores shared altogether. Kind Regards Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
 I still don't agree with you there. The struct would have clearly outlived any
 thread (as it was in the global scope) so at the point where it is destroyed
 there should be really only one thread left. So it IS destroyed in a single
 threaded context.
If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.
 The same is done for classes by the GC just that the GC
 ignores shared altogether.
That's different, because the GC verifies that there are *no* references to it from any thread first.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 10:18, Walter Bright wrote:
 On 11/14/2012 1:01 AM, Benjamin Thaut wrote:
 I still don't agree with you there. The struct would have clearly outlived any thread (as it was in the global scope) so at the point where it is destroyed there should be really only one thread left. So it IS destroyed in a single threaded context.
If you know this for a fact, then cast it to thread local. The compiler cannot figure this out for you, hence it issues the error.
 The same is done for classes by the GC just that the GC
 ignores shared altogether.
That's different, because the GC verifies that there are *no* references to it from any thread first.
Could you please give an example where it would break?

And what's the difference between:

struct Value
{
   ~this()
   {
     printf("destroy\n");
   }
}

shared Value v;

and:

shared static ~this()
{
   printf("destroy\n");
}

Kind Regards
Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
 And whats the difference between:

 struct Value
 {
    ~this()
    {
      printf("destroy\n");
    }
 }

 shared Value v;


 and:


 shared static ~this()
 {
    printf("destory\n");
 }
The struct declaration of ~this() has no idea what context it will be used in.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 11:42, Walter Bright wrote:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.

Kind Regards
Benjamin Thaut
Nov 14 2012
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 2:49 AM, Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.
Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).
 And if the use of pointers is allowed, I can make the same case break in a
 single threaded environment without shared.
1. You can't escape pointers in safe code (well, it's a bug if you do).

2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
Nov 14 2012
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 14.11.2012 12:00, Walter Bright wrote:
 On 11/14/2012 2:49 AM, Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point.
Pointers are safe. It's pointer arithmetic that is not (and escaping pointers).
 And if the use of pointers is allowed, I can make the same case break
 in a
 single threaded environment without shared.
1. You can't escape pointers in safe code (well, it's a bug if you do).

2. If the struct is on the heap, it is only destructed if there are no references to it in any thread. If it is not on the heap, and you are in safe code, it should always be destructed safely when it goes out of scope. This is not so for shared pointers.
So just to be clear, escaping pointers in a single-threaded context is a bug. But if you escape them in a multithreaded context it's OK? That sounds inconsistent to me. But if that is by design, your argument is valid.

I still can not think of any real-world use case though where this could actually be used. A small code example which would break as soon as we allow destructing of shared value types would really be nice (maybe even in the language documentation, because I couldn't find anything).

Kind Regards
Benjamin Thaut
Nov 14 2012
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
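For concreteness, a minimal sketch of that scenario in code - an illustration only, since no code was given in the thread; the struct, the thread API usage and the use of destroy are assumptions:

import core.atomic : atomicOp;
import core.thread : Thread;

struct Value
{
    int x;
    ~this() { x = -1; } // stand-in for releasing a resource
}

void main()
{
    auto v = new shared(Value);      // Thread 1, step 1: create shared object

    auto t = new Thread({
        atomicOp!"+="(v.x, 1);       // Thread 2: manipulate the object,
    });                              // possibly after it has been destroyed
    t.start();                       // Thread 1, step 2: the reference escapes

    destroy(*cast(Value*) v);        // Thread 1, step 3: destroy it; the cast is
                                     // needed precisely because the compiler
    t.join();                        // rejects destroying a shared struct
}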
Nov 14 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 1:06 PM, Walter Bright wrote:
 On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
That should be disallowed at least in safe code. If I had my way I'd explore disallowing it in all code.

Andrei
Nov 14 2012
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-14 22:06, Walter Bright wrote:

 I hate to repeat myself, but:

 Thread 1:
      1. create shared object
      2. pass reference to that object to Thread 2
      3. destroy object

 Thread 2:
      1. manipulate that object
Why would the object be destroyed if there's still a reference to it? If the object is manually destroyed I don't see what threads have to do with it, since you can do the same thing in a single-threaded application.

-- 
/Jacob Carlborg
Nov 15 2012
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 10:22:22 Jacob Carlborg wrote:
 On 2012-11-14 22:06, Walter Bright wrote:
 I hate to repeat myself, but:
 
 Thread 1:
      1. create shared object
      2. pass reference to that object to Thread 2
      3. destroy object
 
 Thread 2:
      1. manipulate that object
Why would the object be destroyed if there's still a reference to it? If the object is manually destroyed I don't see what threads have to do with it since you can do the same thing in a single thread application.
Yeah. If the reference passed across were shared, then the runtime should see it as having multiple references, and if it's _not_ shared, that means that you cast shared away (unsafe, since it's a cast) and passed it across threads without making sure that it was the only reference on the original thread. In that case, you shot yourself in the foot by using an @system construct (casting) and not getting it right. I don't see why the runtime would have to worry about that.

Unless the problem is that the object is a value type, so when it goes away on the first thread, it _has_ to be destroyed? If that's the case, then it's a pointer that was passed across rather than a reference, and then you've effectively done the same thing as returning a pointer to a local variable, which is @system and again only happens if you're getting @system wrong, which the compiler generally doesn't protect you from beyond giving you an error in the few cases where it can determine for certain that what you're doing is wrong (which is a fairly limited portion of the time).

So, as far as I can see - unless I'm just totally missing something here - either you're dealing with shared objects on the heap here, in which case, the object shouldn't be destroyed on the first thread unless you do it manually (in which case, you're doing something stupid in @system code), or you're dealing with passing pointers to shared value types across threads, which is essentially the equivalent of escaping a pointer to a local variable (in which case, you're doing something stupid in @system code). In either case, you're doing something stupid in @system code, and I don't see why the runtime would have to worry about it. You shot yourself in the foot by incorrectly using @system code. If you want protection against that, then don't use @system code.

- Jonathan M Davis
Nov 15 2012
parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 15.11.2012 12:48, Jonathan M Davis wrote:
 Yeah. If the reference passed across were shared, then the runtime should see
 it as having multiple references, and if it's _not_ shared, that means that
 you cast shared away (unsafe, since it's a cast) and passed it across threads
 without making sure that it was the only reference on the original thread. In
 that case, you shot yourself in the foot by using an @system construct
 (casting) and not getting it right. I don't see why the runtime would have to
 worry about that.

 Unless the problem is that the object is a value type, so when it goes away on
 the first thread, it _has_ to be destroyed? If that's the case, then it's a
 pointer that was passed across rather than a reference, and then you've
 effectively done the same thing as returning a pointer to a local variable,
 which is @system and again only happens if you're getting @system wrong, which
 the compiler generally doesn't protect you from beyond giving you an error in
 the few cases where it can determine for certain that what you're doing is
 wrong (which is a fairly limited portion of the time).

 So, as far as I can see - unless I'm just totally missing something here -
 either you're dealing with shared objects on the heap here, in which case, the
 object shouldn't be destroyed on the first thread unless you do it manually (in
 which case, you're doing something stupid in @system code), or you're dealing
 with passing pointers to shared value types across threads, which is
 essentially the equivalent of escaping a pointer to a local variable (in which
 case, you're doing something stupid in @system code). In either case,
 you're doing something stupid in @system code, and I don't see why the runtime
 would have to worry about it. You shot yourself in the foot by incorrectly
 using @system code. If you want protection against that, then don't use @system
 code.

 - Jonathan M Davis
Thank you, that's exactly how I'm thinking too. And because of this it makes absolutely no sense to me to disallow the destruction of a shared struct if it is allocated on the stack or as a global. If it is allocated on the heap you can't destroy it manually anyway, because delete is deprecated.

And for exactly this reason I wanted a code example from Walter. Because just listing a few bullet points does not make a real-world use case.

Kind Regards
Benjamin Thaut
Nov 15 2012
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/15/2012 1:06 AM, Walter Bright wrote:
 On 11/14/2012 3:14 AM, Benjamin Thaut wrote:
 A small code example which would break as soon as we allow destructing of shared value types would really be nice.
I hate to repeat myself, but:

Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
Ain't structs typically copied anyway? Reference would imply pointer then. If the struct is on the stack (weird but could be) then the thread that created it destroys the object once. The thing is as unsafe as escaping a pointer is. Personally I think that shared stuff allocated on the stack is here-be-dragons @system code in any case.

Otherwise it's the GC's responsibility to destroy a heap-allocated struct when there are no references to it. What's so puzzling about it?

BTW currently GC-allocated structs are not having their destructors called at all. The bug is however _minor_ ...
http://d.puremagic.com/issues/show_bug.cgi?id=2834

-- 
Dmitry Olshansky
Nov 15 2012
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 11:49:22 Benjamin Thaut wrote:
 Am 14.11.2012 11:42, schrieb Walter Bright:
 On 11/14/2012 1:23 AM, Benjamin Thaut wrote:
 Could you please give an example where it would break?
Thread 1:
    1. create shared object
    2. pass reference to that object to Thread 2
    3. destroy object

Thread 2:
    1. manipulate that object
But for passing a reference to a value type you would have to use a pointer, correct? And pointers are an unsafe feature anyway... I don't see your point. And if the use of pointers is allowed, I can make the same case break in a single-threaded environment without shared.
Pointers are not considered unsafe at all and are perfectly legal in SafeD. It's pointer _arithmetic_ which is unsafe and therefore considered to be @system.

- Jonathan M Davis
Nov 14 2012
prev sibling next sibling parent "Jason House" <jason.james.house gmail.com> writes:
On Monday, 12 November 2012 at 02:31:05 UTC, Walter Bright wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is a fairly reasonable use of shared, but it is bypassing the type system. Once shared is cast away, it is free to be mixed with thread-local variables. Pieces can be assigned to non-shared globals, impure functions can stash references, weakly pure functions can mix their arguments together, etc...

If locking converts shared(T) to bikeshed(T), I bet some of SafeD's logic for no escaping references could be used to improve things.

It's also interesting to note that casting away shared after taking a lock implicitly means that everything was transitively owned by that lock. I wonder how well a library could promote/enforce such a thing?
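A minimal sketch of those five steps with hypothetical names (Account, accountLock, deposit); note that nothing stops the unshared pointer from escaping the locked region, which is exactly the hole described above:

import core.sync.mutex : Mutex;

struct Account { int balance; }

shared Account account;      // the shared data
__gshared Mutex accountLock; // protects `account` by convention only

shared static this() { accountLock = new Mutex; }

void deposit(int amount)
{
    accountLock.lock();                   // 1. ensure single-threaded access
    scope (exit) accountLock.unlock();    // 5. release the mutex on scope exit

    auto local = cast(Account*) &account; // 2. cast away shared
    local.balance += amount;              // 3. operate on the data
}                                         // 4. the unshared view simply dies here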
Nov 14 2012
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/11/12 6:30 PM, Walter Bright wrote:
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote it in TDPL). I can't believe I need to restart this on a cold cache.

Andrei
Nov 14 2012
next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 18:30:56 Andrei Alexandrescu wrote:
 On 11/11/12 6:30 PM, Walter Bright wrote:
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.
Well, this is clearly how things work now, and if you want to use shared with much of anything, it's how things generally have to work, because almost nothing takes shared. Templated stuff will at least some of the time (though it's often untested for it and probably will get screwed by Unqual in quite a few cases), but there's no way aside from templates or casting to get shared variables to share the same functions as non-shared ones, leading to code duplication.
From what I recall of what TDPL says, this doesn't really contradict it. It's just that TDPL doesn't really say much about the fact that almost nothing will work with shared, which means that casting is necessary.

I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required. And I think that introducing the shared equivalent of const would be a huge mistake, because then most code would end up being written using that attribute, meaning that all code essentially has to be treated as shared from the standpoint of compiler optimizations. It would almost be the same as making everything shared by default again. So, as far as I can see, casting is what we're forced to do.

- Jonathan M Davis
Nov 14 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:

 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required.
One thing I'm confused about right now is how people are using shared. If you're using shared with atomic operations, then you need barriers when accessing or mutating the variable. If you're using shared with mutexes, spin-locks, etc., you don't care about the barriers. But you can't use it with both at the same time. So which of these does shared stand for?

In both of these cases, there's an implicit policy for accessing or mutating the variable. I think the language needs some way to express that policy. I suggested some time ago a way to protect variables with mutexes so that the compiler can actually help you use those mutexes correctly[1]. The idea was to associate a mutex to the variable declaration. This could be extended to support an atomic access policy.

Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex:

	synchronized int i; // declaration

	i++; // error, variable shared

	synchronized (i)
		i++; // fine, variable is thread-local inside synchronized block

Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds.

Now, if you want a custom mutex class, write it like this:

	synchronized(SpinLock) int i;

	synchronized(i)
	{
		// implicit: i.mutexof.lock();
		// implicit: scope (exit) i.mutexof.unlock();
		i++;
	}

If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration:

	Mutex m;
	synchronized(m) int i;

	synchronized(i)
	{
		// implicit: m.lock();
		// implicit: scope (exit) m.unlock();
		i++;
	}

Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const:

	synchronized(RWMutex) int i;

	synchronized(const i)
	{
		// implicit: i.mutexof.constLock();
		// implicit: scope (exit) i.mutexof.constUnlock();
		i++; // error, i is const
	}

And finally, if you want to use atomic operations, declare it this way:

	synchronized(Atomic) int i;

You can't really synchronize on something protected by Atomic:

	synchronized(i) // cannot make synchronized block, no lock/unlock method in Atomic
	{}

But you can call operators on it while synchronized, it works for anything implemented by Atomic:

	synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.

Okay, this syntax probably still has some problems, feel free to point them out. I don't really care about the syntax though. The important thing is that you need a way to define the policy for accessing the shared data in a way the compiler can actually enforce, and that programmers can actually reuse. Because right now there is no policy. Having to cast things everywhere is equivalent to having to redefine the policy everywhere. Same for having to write encapsulation types that work with shared for everything you want to share: each type has to implement the policy.
There's nothing worse than constantly rewriting the sharing policies. Concurrency is error-prone because of all the subtleties; you don't want to encourage people to write policies of their own every time they invent a new type. You need to reuse existing ones, and the compiler can help with that.

[1]: http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 14 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Thu, 15 Nov 2012 04:33:20 -0000, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2012-11-15 02:51:13 +0000, "Jonathan M Davis" <jmdavisProg gmx.com> said:

 I have no idea what we want to do about this situation though. Regardless of what we do with memory barriers and the like, it has no impact on whether casts are required.

 Let me restate and extend that idea to atomic operations. Declare a variable using the synchronized storage class and it automatically gets a mutex:

 	synchronized int i; // declaration

 	i++; // error, variable shared

 	synchronized (i)
 		i++; // fine, variable is thread-local inside synchronized block

 Synchronized here is some kind of storage class causing two things: a mutex is attached to the variable declaration, and the type of the variable is made shared. The variable being shared, you can't access it directly. But a synchronized statement will make the variable non-shared within its bounds.

 Now, if you want a custom mutex class, write it like this:

 	synchronized(SpinLock) int i;

 	synchronized(i)
 	{
 		// implicit: i.mutexof.lock();
 		// implicit: scope (exit) i.mutexof.unlock();
 		i++;
 	}

 If you want to declare the mutex separately, you could do it by specifying a variable instead of a type in the variable declaration:

 	Mutex m;
 	synchronized(m) int i;

 	synchronized(i)
 	{
 		// implicit: m.lock();
 		// implicit: scope (exit) m.unlock();
 		i++;
 	}

 Also, if you have a read-write mutex and only need read access, you could declare that you only need read access using const:

 	synchronized(RWMutex) int i;

 	synchronized(const i)
 	{
 		// implicit: i.mutexof.constLock();
 		// implicit: scope (exit) i.mutexof.constUnlock();
 		i++; // error, i is const
 	}

 And finally, if you want to use atomic operations, declare it this way:

 	synchronized(Atomic) int i;

 You can't really synchronize on something protected by Atomic:

 	synchronized(i) // cannot make synchronized block, no lock/unlock method in Atomic
 	{}

 But you can call operators on it while synchronized, it works for anything implemented by Atomic:

 	synchronized(i)++; // implicit: Atomic.opUnary!"++"(i);

 Because the policy object is associated with the variable declaration, when locking the mutex you need direct access to the original variable, or an alias to it. Same for performing atomic operations. You can't pass a reference to some function and have that function perform the locking. If that's a problem it can be avoided by having a way to pass the mutex to the function, or by passing an alias to a template.

+1

I suggested something similar as did Sönke:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk

According to deadalnix the compiler magic I suggested to add the mutex isn't possible:
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com

Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc).

So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is..

1. Remove the existing mutex from object.
2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface
3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*)
4. Design a Shared(*) base class which contains a mutex and implements synchable(*)

Then we design classes which are always shared using the base class and we wrap other objects we want to share in Shared() and use them in synchronized statements.

This would then relegate any builtin "shared" statement to be solely a storage class which makes the object global and not thread local.

(*) names up for debate

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 15 2012
parent Sean Kelly <sean invisibleduck.org> writes:
On Nov 15, 2012, at 3:16 AM, Regan Heath <regan netmail.co.nz> wrote:

 I suggested something similar as did Sönke:
 http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnuiio554xghj:40puck.auriga.bhead.co.uk

 According to deadalnix the compiler magic I suggested to add the mutex isn't possible:
 http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=3#post-k7qsb5:242gqk:241:40digitalmars.com

 Most of our ideas can be implemented with a wrapper template containing the sync object (mutex, etc).

If I understand you correctly, you don't need anything that explicitly contains the sync object. A global table of mutexes used according to the address of the value to be mutated should work.

 So... my feeling is that the best solution for "shared", ignoring the memory barrier aspect which I would relegate to a different feature and solve a different way, is..

 1. Remove the existing mutex from object.
 2. Require that all objects passed to synchronized() {} statements implement a synchable(*) interface
 3. Design a Shared(*) wrapper template/struct that contains a mutex and implements synchable(*)
 4. Design a Shared(*) base class which contains a mutex and implements synchable(*)

It would be nice to eliminate the mutex that's optionally built into classes now. The possibility of having to allocate a new mutex on whatever random function call happens to be the first one with "synchronized" is kinda not great.
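A rough sketch of that global-table idea, with invented names (mutexFor, withLock) and no attempt at lock striping or cleanup; it only illustrates the shape of the approach:

import core.sync.mutex : Mutex;

__gshared Mutex[void*] mutexTable; // address -> mutex guarding that address
__gshared Mutex tableLock;         // guards the table itself

shared static this() { tableLock = new Mutex; }

Mutex mutexFor(shared(void)* addr)
{
    auto key = cast(void*) addr;
    tableLock.lock();
    scope (exit) tableLock.unlock();
    if (auto m = key in mutexTable)
        return *m;
    return mutexTable[key] = new Mutex;
}

void withLock(T)(ref shared T value, scope void delegate(ref T) dg)
{
    auto m = mutexFor(cast(shared(void)*) &value);
    m.lock();
    scope (exit) m.unlock();
    dg(*cast(T*) &value); // hand out an unshared view while the lock is held
}

Usage would be along the lines of `shared int hits; withLock(hits, (ref int h) { ++h; });`.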
Nov 15 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/15/2012 8:33 AM, Michel Fortin wrote:

 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:

      Mutex m;
      synchronized(m) int i;

      synchronized(i)
      {
          // implicit: m.lock();
          // implicit: scope (exit) m.unlock();
          i++;
      }
While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot. I'd say: "Need direct access to the mutex? - Go on with the manual way, it's still right there (and scope(exit) for that matter)". Another problem is that somebody clever can escape a reference to the unlocked 'i' inside of synchronized to somewhere else.

But anyway we can make it in the library right about now.

synchronized T ---> Synchronized!T

synchronized(i){ ... } --->

i.access((x){
	//will lock & cast away shared T inside of it
	...
});

I fail to see what it doesn't solve (aside of syntactic sugar).

The key point is that Synchronized!T is otherwise an opaque type. We could pack a few other simple primitives like 'load', 'store' etc. All of them will go through lock-unlock. Even escaping a reference can be solved by passing inside of 'access' a proxy of T. It could even assert that the lock is indeed locked.

Same goes about Atomic!T. Though the set of primitives is quite limited depending on T. (I thought that built-in shared(T) is already atomic though so no need to reinvent this wheel.)

It's time we finally agree that the 'shared' qualifier is an assembly language of multi-threading based on sharing. It just needs some safe patterns in the library. That and clarifying explicitly what guarantees (aside from being well.. being shared) it provides w.r.t. the memory model.

Until reaching this thread I was under the impression that shared means:
- globally visible
- atomic operations for stuff that fits in one word
- sequentially consistent guarantee
- any other forms of access are disallowed except via casts

-- 
Dmitry Olshansky
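A bare-bones sketch of what such a Synchronized!T might look like (illustrative only; the gist linked later in the thread is more complete):

import core.sync.mutex : Mutex;

struct Synchronized(T)
{
    private shared T payload;
    private Mutex mtx;

    this(T initial)
    {
        payload = cast(shared) initial;
        mtx = new Mutex; // must be constructed; default init would leave it null
    }

    // Locks, hands the delegate an unshared view, unlocks on scope exit.
    void access(scope void delegate(ref T) dg)
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        dg(*cast(T*) &payload);
    }
}

unittest
{
    auto counter = Synchronized!int(0);
    counter.access((ref int x) { ++x; });
}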
Nov 15 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/15/2012 8:33 AM, Michel Fortin wrote:
 
 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:
 
      Mutex m;
      synchronized(m) int i;
 
      synchronized(i)
      {
          // implicit: m.lock();
          // implicit: scope (exit) m.unlock();
          i++;
      }
While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex. For instance:

	Mutex m;
	synchronized(m) int next_id;
	synchronized(m) Object[int] objects_by_id;

	int addObject(Object o)
	{
		synchronized(next_id, objects_by_id)
			return objects_by_id[next_id++] = o;
	}

Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id.

I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.
 But anyway we can make it in the library right about now.
 
 synchronized T ---> Synchronized!T
 synchronized(i){ ... } --->
 
 i.access((x){
 //will lock & cast away shared T inside of it
 	...
 });
 
 I fail to see what it doesn't solve (aside of syntactic sugar).
It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T:

	Synchronized!(Tuple!(int, Object[int])) objects_by_id;

	int addObject(Object o)
	{
		int id;
		objects_by_id.access((obj_by_id){
			id = obj_by_id[1][obj_by_id[0]++] = o;
		});
		return id;
	}

I'm not sure if I have to explain why I prefer the first one or not, to me it's pretty obvious.
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc. 
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
 Even escaping a reference can be solved by passing inside of 'access'
 a proxy of T. It could even asserts that the lock is in indeed locked.
Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container.

This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later.

What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and things alike impossible without casts at every corner, which makes things even more bug prone than being able to escape references in the first place.

There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the uttermost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.
 Same goes about Atomic!T. Though the set of primitives is quite limited 
 depending on T.
 (I thought that built-in shared(T) is already atomic though so no need 
 to reinvent this wheel)
 
 It's time we finally agree that 'shared' qualifier is an assembly 
 language of multi-threading based on sharing. It just needs some safe 
 patterns in the library.
 
 That and clarifying explicitly what guarantees (aside from being well.. 
 being shared) it provides w.r.t. memory model.
 
 Until reaching this thread I was under impression that shared means:
 - globally visible
 - atomic operations for stuff that fits in one word
 - sequentially consistent guarantee
 - any other forms of access are disallowed except via casts
Built-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what the conclusion will be, but the way I see things atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads.

I'm not thrilled by the idea of making everything atomic by default. That'll lure users onto the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) to second-class citizens. I think it's better that you just can't do anything with shared, or that shared simply disappear, and that those variables that must be shared be accessible only through some kind of access policy. Atomic access should be one of those access policies, on an equal footing with other ones.

But if D2 is still "frozen" -- as it was meant to be when TDPL got out -- and only minor changes can be made to it now, I don't see much hope for its concurrency model. Your Synchronized!T and Atomic!T wrappers might be the best thing we can hope for, but they're nothing to set D apart from its rivals (I could implement that easily in C++ for instance).

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 16 2012
next sibling parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
On 16.11.2012 14:17, Michel Fortin wrote:
 
 Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its parts, like if you slice a container.

 This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier to patch the hole later.

 What bothers me currently is that because we want to patch all the holes while not having all the necessary tools in the language to avoid escaping references, we just make using mutexes and things alike impossible without casts at every corner, which makes things even more bug prone than being able to escape references in the first place.

 There are many perils in concurrency, and the compiler cannot protect you from them all. It is of the uttermost importance that code dealing with mutexes be both readable and clear about what it is doing. Casts in this context are an obfuscator.
 
Can you have a look at my thread about this?
http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com

I would of course favor a nicely integrated language solution that is able to lift as many restrictions as possible, while still keeping everything statically verified [I would also like to have a language solution to Rebindable!T ;)]. But as an alternative to a years-lasting discussion which does not lead to any agreed-upon solution, I'd much rather have such a library solution - it can do a lot, is reasonably pretty, and is (supposedly and with a small exception) fully safe.
Nov 16 2012
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:

 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/15/2012 8:33 AM, Michel Fortin wrote:
 If you want to declare the mutex separately, you could do it by
 specifying a variable instead of a type in the variable declaration:
     Mutex m;
     synchronized(m) int i;
     synchronized(i)
     {
         // implicit: m.lock();
         // implicit: scope (exit) m.unlock();
         i++;
     }
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.

 In case you want to protect two variables (or more) with the same mutex. For instance:

 	Mutex m;
 	synchronized(m) int next_id;
 	synchronized(m) Object[int] objects_by_id;

 	int addObject(Object o)
 	{
 		synchronized(next_id, objects_by_id)
 			return objects_by_id[next_id++] = o;
 	}

 Here it doesn't make sense and is less efficient to have two mutexes, since every time you need to lock on next_id you'll also want to lock on objects_by_id.

 I'm not sure how you could shoot yourself in the foot with this. You might get worse performance if you reuse the same mutex for too many things, just like you might get better performance if you use it wisely.

This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.
Nov 16 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-16 15:23:37 +0000, Sean Kelly <sean invisibleduck.org> said:

 On Nov 16, 2012, at 5:17 AM, Michel Fortin <michel.fortin michelf.ca> wrote:
 
 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex.
This is what setSameMutex was intended for in Druntime. Except that no one uses it and people have requested that it be removed. Perhaps that's because the semantics aren't great though.
Perhaps it's just my style of coding, but when designing a class that needs to be shared in C++, I usually use one mutex to protect only a couple of variables inside the object. That might mean I have two mutexes in one class for two sets of variables if it fits the access pattern. I also make the mutex private so that derived classes cannot access it. The idea is to strictly control what happens when each mutex is locked so that I can make sure I never have two mutexes locked at the same time without looking at the whole code base. This is to avoid deadlocks, and also it removes the need for recursive mutexes.

I'd like the language to help me enforce this pattern, and what I'm proposing goes in that direction.

Regarding setSameMutex, I'd argue that the semantics of having one mutex for a whole object isn't great. Mutexes shouldn't protect types, they should protect variables. Whether a class needs to protect its variables and how it does it is an implementation detail that shouldn't be leaked to the outside world. What the outside world should know is whether the object is thread-safe or not.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 17 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/16/2012 5:17 PM, Michel Fortin wrote:
 On 2012-11-15 16:08:35 +0000, Dmitry Olshansky <dmitry.olsh gmail.com>
 said:
 While the rest of the proposal was more or less fine, I don't get why we need escape control of the mutex at all - in any case it just opens a possibility to shoot yourself in the foot.
In case you want to protect two variables (or more) with the same mutex. For instance:

	Mutex m;
	synchronized(m) int next_id;
	synchronized(m) Object[int] objects_by_id;
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
      int addObject(Object o)
      {
          synchronized(next_id, objects_by_id)
...synchronized(objRepo) with(objRepo)... Though I'd rather use it as a struct directly.
              return objects_by_id[next_id++] = o;
      }

 Here it doesn't make sense and is less efficient to have two mutexes,
 since every time you need to lock on next_id you'll also want to lock on
 objects_by_id.
Yes. But we shouldn't close our eyes to the rest of the language when deciding how to implement this. Moreover it makes more sense to pack related stuff (that is under a single lock) into a separate entity.
 I'm not sure how you could shoot yourself in the foot with this. You
 might get worse performance if you reuse the same mutex for too many
 things, just like you might get better performance if you use it wisely.
Easily - now the mutex is separate and there is no guarantee that it won't get used for something other than intended. The declaration implies the connection but I do not see anything preventing it from abuse.
 But anyway we can make it in the library right about now.

 synchronized T ---> Synchronized!T
 synchronized(i){ ... } --->

 i.access((x){
 //will lock & cast away shared T inside of it
     ...
 });

 I fail to see what it doesn't solve (aside of syntactic sugar).
 It solves the problem too. But it's significantly more inconvenient to use. Here's my example above redone using Synchronized!T:

 	Synchronized!(Tuple!(int, Object[int])) objects_by_id;

 	int addObject(Object o)
 	{
 		int id;
 		objects_by_id.access((obj_by_id){
 			id = obj_by_id[1][obj_by_id[0]++] = o;
 		});
 		return id;
 	}

 I'm not sure if I have to explain why I prefer the first one or not, to me it's pretty obvious.
If we made a tiny change in the language that would allow different syntax for passing delegates, mine would shine. Such a change at the same time enables a nicer way to abstract away control flow.

Imagine:

access(object_by_id){
	...
};

to be convertible to:

(x){with(x){
	...
}}(access(object_by_id));

More generally speaking, a lowering:

expression { ... }
-->
(x){with(x){ ... }}(expression);

AFAIK it doesn't conflict with anything.

Or wait a sec. Even simpler idiom and no extra features. Drop the idea of 'access' taking a delegate. The other library idiom is to return a RAII proxy that locks/unlocks an object on construction/destroy.

with(lock(object_by_id))
{
	... do what you like
}

Fine by me. And C++ can't do it ;)
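A sketch of that RAII proxy idiom, with hypothetical names (Locked, lock); it locks on construction, unlocks on destruction, and hands out an unshared view in between:

import core.sync.mutex : Mutex;

struct Locked(T)
{
    private T* data;
    private Mutex mtx;

    @disable this(this); // non-copyable: exactly one owner of the lock

    this(shared(T)* p, Mutex m)
    {
        data = cast(T*) p; // strip shared only while the mutex is held
        mtx = m;
        mtx.lock();
    }

    ~this() { mtx.unlock(); }

    ref T get() { return *data; }
}

Locked!T lock(T)(ref shared T value, Mutex m)
{
    return Locked!T(&value, m);
}

Usage would be along the lines of `{ auto r = lock(objRepo, repoMutex); r.get.objects_by_id[r.get.next_id++] = o; }` - the mutex is released when r goes out of scope.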
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc.
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Your works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
I kind of wanted to point out this disturbing thought about your proposal: a lot of extra syntax and rules added buys us a very small gain - prettier syntax.
 Even escaping a reference can be solved by passing inside of 'access'
 a proxy of T. It could even asserts that the lock is in indeed locked.
Only if you can make a proxy object that cannot leak a reference. It's already not obvious how to not leak the top-level reference, but we must also consider the case where you're protecting a data structure with the mutex and get a pointer to one of its part, like if you slice a container. This is a hard problem. The language doesn't have a solution to that yet. However, having the link between the access policy and the variable known by the compiler makes it easier patch the hole later.
It need not be 100% malicious-dumbass-proof. Basic foolproofness is OK.

See my sketch, it could be vastly improved:
https://gist.github.com/4089706

See also Ludwig's work. Though he is focused on classes and their monitor mutex.
 What bothers me currently is that because we want to patch all the holes
 while not having all the necessary tools in the language to avoid
 escaping references, we just make using mutexes and things alike
 impossible without casts at every corner, which makes things even more
 bug prone than being able to escape references in the first place.
Well, it's kind of double-edged. However I do think we need more general tools in the language and niche ones in the library. Precisely because you can pack tons of niche and miscellaneous stuff on the bookshelf ;) Locks & the works are niche stuff enabling a lot more of common things.
 There are many perils in concurrency, and the compiler cannot protect
 you from them all. It is of the uttermost importance that code dealing
 with mutexes be both readable and clear about what it is doing. Casts in
 this context are an obfuscator.
See below about high-level primitives. The code dealing with mutexes has to be small and isolated anyway. Encouraging the pattern of 'just grab the lock and you are golden' is even worse (because it won't break as fast and hard as e.g. naive atomics will).
 That and clarifying explicitly what guarantees (aside from being
 well.. being shared) it provides w.r.t. memory model.

 Until reaching this thread I was under impression that shared means:
 - globally visible
 - atomic operations for stuff that fits in one word
 - sequentially consistent guarantee
 - any other forms of access are disallowed except via casts
Built-in shared(T) atomicity (sequential consistency) is a subject of debate in this thread. It is not clear to me what will be the conclusion, but the way I see things atomicity is just one of the many policies you may want to use for keeping consistency when sharing data between threads. I'm not trilled by the idea of making everything atomic by default. That'll lure users to the bug-prone expert-only path while relegating the more generally applicable protection systems (mutexes) as a second-class citizen.
That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and what not. Even Java has some useful incarnations of these.
 I think it's better that you just can't do
 anything with shared, or that shared simply disappear, and that those
 variables that must be shared be accessible only through some kind of
 access policy. Atomic access should be one of those access policies, on
 an equal footing with other ones.
This is where casts will be a most unwelcome obfuscator and there is no sensible way to de-obscure it by using higher level primitives. Having to say Atomic!X is workable though.
 But if D2 is still "frozen" -- as it was meant to be when TDPL got out
 -- and only minor changes can be made to it now, I don't see much hope
 for its concurrency model. Your Syncronized!T and Atomic!T wrappers
 might be the best thing we can hope for, but they're nothing to set D
 apart from its rivals (I could implement that easily in C++ for instance).
Yeah, but we may tweak some syntax in terms of one lowering or a couple.

I'm of the strong opinion that lock-based multi-threading needs no _specific_ built-in support in the language. The case is niche and hardly useful outside of certain help with doing safe high-level primitives in the library. As for client code, it doesn't care that much.

Compared to C++ there is one big thing: no-shared by default. This alone should be immensely helpful, especially when dealing with 3rd party libraries that 'try hard to be thread-safe' except that they are usually not.

-- 
Dmitry Olshansky
Nov 16 2012
parent reply Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:

 11/16/2012 5:17 PM, Michel Fortin wrote:
 In case you want to protect two variables (or more) with the same mutex.
 For instance:
 
      Mutex m;
      synchronized(m) int next_id;
      synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
 If we made a tiny change in the language that would allow different 
 syntax for passing delegates mine would shine. Such a change at the 
 same time enables more nice way to abstract away control flow.
 
 Imagine:
 
 access(object_by_id){
 	...	
 };
 
 to be convertible to:
 
 (x){with(x){
 	...
 }}(access(object_by_id));
 
 More generally speaking a lowering:
 
 expression { ... }
 -->
 (x){with(x){ ... }}(expression);
 
 AFAIK it doesn't conflict with anything.
 
 Or wait a sec. Even simpler idiom and no extra features.
 Drop the idea of 'access' taking a delegate. The other library idiom is 
 to return a RAII proxy that locks/unlocks an object on 
 construction/destroy.
 
 with(lock(object_by_id))
 {
 	... do what you like
 }
 
 Fine by me. And C++ can't do it ;)
Clever. But you forgot to access the variable somewhere. What's its name within the with block? Your code would be clearer this way:

      {
          auto locked_object_by_id = lock(object_by_id);
          // … do what you like
      }

And yes you can definitely do that in C++.

I maintain that the "synchronized (var)" syntax is still much clearer, and greppable too. That could be achieved with an appropriate lowering.
 The key point is that Synchronized!T is otherwise an opaque type.
 We could pack a few other simple primitives like 'load', 'store' etc.
 All of them will go through lock-unlock.
Our proposals are pretty much identical. Yours works by wrapping a variable in a struct template, mine is done with a policy object/struct associated with a variable. They'll produce the same code and impose the same restrictions.
I kind of wanted to point out this disturbing thought about your proposal: a lot of extra syntax and rules added buys us a very small gain - prettier syntax.
Sometimes having something built into the language is important: it gives first-class status to some constructs. For instance: arrays. We don't need language-level arrays in D, we could just use a struct template that does the same thing. By integrating a feature into the language we're sending the message that this is *the* way to do it, as no other way can stand on equal footing, preventing infinite reimplementation of the concept within various libraries.

You might be right however that mutex-protected variables do not deserve this first-class status.
 Built-in shared(T) atomicity (sequential consistency) is a subject of
 debate in this thread. It is not clear to me what will be the
 conclusion, but the way I see things atomicity is just one of the many
 policies you may want to use for keeping consistency when sharing data
 between threads.
 
 I'm not thrilled by the idea of making everything atomic by default.
 That'll lure users to the bug-prone expert-only path while relegating
 the more generally applicable protection systems (mutexes) as a
 second-class citizen.
That's why I think people shouldn't have to use mutexes at all. Explicitly - provide folks with blocking queues, Synchronized!T, concurrent containers (e.g. hash map) and what not. Even Java has some useful incarnations of these.
I wouldn't say they shouldn't use mutexes at all, but perhaps you're right that they don't deserve first-class treatment. I still maintain that "synchronized (var)" should work, for clarity and consistency reasons, but using a template such as Synchronized!T when declaring the variable might be the best solution.

-- 
Michel Fortin
michel.fortin michelf.ca
http://michelf.ca/
Nov 17 2012
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-17 14:22, Michel Fortin wrote:

 Sometimes having something built into the language is important: it gives
 first-class status to some constructs. For instance: arrays. We don't
 need language-level arrays in D, we could just use a struct template
 that does the same thing. By integrating a feature into the language
 we're sending the message that this is *the* way to do it, as no other
 way can stand on equal footing, preventing infinite reimplementation of
 the concept within various libraries.
If a feature can be implemented in a library with the same syntax, semantics and performance, I see no reason to put it in the language.

-- 
/Jacob Carlborg
Nov 17 2012
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11/17/2012 5:22 PM, Michel Fortin wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 Or wait a sec. Even simpler idiom and no extra features.
 Drop the idea of 'access' taking a delegate. The other library idiom
 is to return a RAII proxy that locks/unlocks an object on
 construction/destroy.

 with(lock(object_by_id))
 {
     ... do what you like
 }

 Fine by me. And C++ can't do it ;)
Clever. But you forgot to access the variable somewhere. What's its name within the with block?
Not having the name would imply you can't escape it :) But I agree it's not always clear where the writes go to when doing things inside the with block.
Your code would be clearer this way:

      {
          auto locked_object_by_id = lock(object_by_id);
          // … do what you like
      }

 And yes you can definitely do that in C++.
Well, I actually did it in the past when C++0x was relatively new. I just thought 'with' makes it more interesting. As to how to access the variable - it depends on what it is.
 I maintain that the "synchronized (var)" syntax is still much clearer,
 and greppable too. That could be achieved with an appropriate lowering.
Yes! If we could make synchronized user-hookable, this all would be more clear and generally useful. There was a discussion about providing user-defined semantics for the synchronized block. It was clear and useful and a lot of folks were in favor of it. Yet it wasn't submitted as a proposal.

All other things being equal, I believe we should go in this direction - amend a couple of things (say, add a user-hookable synchronized) and start laying bricks for std.sharing.

-- 
Dmitry Olshansky
Nov 17 2012
prev sibling parent reply "foobar" <foo bar.com> writes:
On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin 
wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky 
 <dmitry.olsh gmail.com> said:

 11/16/2012 5:17 PM, Michel Fortin пишет:
 In case you want to protect two variables (or more) with the 
 same mutex.
 For instance:
 
     Mutex m;
     synchronized(m) int next_id;
     synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
<snip> That solution does not work in the general case. More specifically any graph-like data structure. E.g a linked-lists, trees, etc.. Think for example an insert to a shared AVL tree.
Nov 19 2012
parent Michel Fortin <michel.fortin michelf.ca> writes:
On 2012-11-19 09:31:46 +0000, "foobar" <foo bar.com> said:

 On Saturday, 17 November 2012 at 13:22:23 UTC, Michel Fortin wrote:
 On 2012-11-16 18:56:28 +0000, Dmitry Olshansky <dmitry.olsh gmail.com> said:
 
 11/16/2012 5:17 PM, Michel Fortin пишет:
 In case you want to protect two variables (or more) with the same mutex.
 For instance:
 
     Mutex m;
     synchronized(m) int next_id;
     synchronized(m) Object[int] objects_by_id;
 
Wrap it in a struct and it would be even much clearer and safer:

struct ObjectRepository
{
	int next_id;
	Object[int] objects_by_id;
}
//or whatever that combination indicates anyway
synchronized ObjectRepository objRepo;
I guess that'd be fine too.
<snip> That solution does not work in the general case. More specifically any graph-like data structure. E.g a linked-lists, trees, etc.. Think for example an insert to a shared AVL tree.
No solution will be foolproof in the general case unless we add new type modifiers to the language to prevent escaping references, something Walter is reluctant to do. So whatever we do with mutexes it'll always be a leaky abstraction. I'm not too thrilled by this either. -- Michel Fortin michel.fortin michelf.ca http://michelf.ca/
Nov 19 2012
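The escape problem can be shown in a few lines (all types here are made up for illustration): the mutex protects the structure only by convention, and nothing stops an interior reference from leaking out of the locked region.

---
import core.sync.mutex : Mutex;

class Node { int value; Node next; }

class List
{
    Node head;
    Mutex mtx; // by convention, guards head and everything reachable from it
}

void leaky(List list)
{
    Node leaked;
    synchronized (list.mtx)
    {
        leaked = list.head; // perfectly legal: just a reference copy
    }
    leaked.value = 42;      // mutated after the lock is released: data race
}
---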
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 From what I recall of what TDPL says
It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.
Nov 14 2012
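For reference, what current compilers actually provide is the explicit route via core.atomic rather than the automatic guarantees TDPL describes; a minimal sketch:

---
import core.atomic;

shared int counter;

void bump()
{
    // read-modify-write performed as a single atomic step; a plain
    // counter++ on shared data carries no such guarantee
    atomicOp!"+="(counter, 1);
}

int current()
{
    // sequentially consistent load of the shared value
    return atomicLoad(counter);
}
---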
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
Nov 14 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 11/14/12 7:24 PM, Jonathan M Davis wrote:
 On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted. Andrei
Nov 14 2012
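A sketch of what TDPL 13.14 describes; note that, as the replies below point out, the lifting is not actually implemented, so current compilers may not accept the plain field access:

---
// Every public method of a synchronized class runs under the object's
// monitor; per TDPL, top-level shared is lifted from the fields inside
// method bodies, so no casts or atomics are needed for them.
synchronized class Counter
{
    private int count;

    void increment()
    {
        ++count; // relies on the TDPL-described lifting
    }

    int get()
    {
        return count;
    }
}
---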
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, November 14, 2012 20:32:35 Andrei Alexandrescu wrote:
 TDPL 13.14 explains that inside synchronized classes, top-level shared
 is automatically lifted.
Then it's doing the casting for you. I suppose that that's an argument that using synchronized classes when dealing with shared is the way to go (which IIRC TDPL does argue), but that only applies to classes, and there are plenty of cases (maybe even the majority) where it's built-in types like arrays or AAs which people are trying to share, and synchronized classes won't help them there unless they create wrapper types. And explicit casting will be required for them.

And of course, anyone wanting to use mutexes or synchronized blocks will have to use explicit casts regardless of what they're protecting, because it won't be inside a synchronized class. So, while synchronized classes make dealing with classes nicer, they only handle a very specific portion of what might be used with shared.

In any case, I clearly need to reread TDPL's threading stuff (and maybe the whole book). It's been a while since I read it, and I'm getting rusty on the details.

By the way, speaking of synchronized classes, as I understand it, they're still broken with regards to TDPL in that synchronized is still used on functions rather than classes like TDPL describes. So, they aren't currently a solution regardless of what the language's actual design is supposed to be. Obviously, that should be fixed though.

- Jonathan M Davis
Nov 15 2012
prev sibling parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 15.11.2012 05:32, Andrei Alexandrescu wrote:
 On 11/14/12 7:24 PM, Jonathan M Davis wrote:
 On Thursday, November 15, 2012 03:51:13 Jonathan M Davis wrote:
 I have no idea what we want to do about this situation though. Regardless of
 what we do with memory barriers and the like, it has no impact on whether
 casts are required. And I think that introducing the shared equivalent of
 const would be a huge mistake, because then most code would end up being
 written using that attribute, meaning that all code essentially has to be
 treated as shared from the standpoint of compiler optimizations. It would
 almost be the same as making everything shared by default again. So, as far
 as I can see, casting is what we're forced to do.
Actually, I think that what it comes down to is that shared works nicely when you have a type which is designed to be shared, and it encapsulates everything that it needs. Where it starts requiring casting is when you need to pass it to other stuff. - Jonathan M Davis
TDPL 13.14 explains that inside synchronized classes, top-level shared is automatically lifted. Andrei
There are three problems I currently see with this:

- It's not actually implemented
- It's not safe, because unshared references can be escaped or dragged in
- Synchronized classes provide no way to avoid the automatic locking in certain methods, but often it is necessary to have more fine-grained control for efficiency reasons, or to avoid deadlocks
Nov 15 2012
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 04:30, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:

 On 11/11/12 6:30 PM, Walter Bright wrote:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
This is very different from how I view we should do things (and how we actually agreed to do things and how I wrote in TDPL). I can't believe I need to restart this on a cold cache.
The pattern Walter describes is primitive and useful, and I'd like to see shared assist to that end (see my previous post). You can endeavour to do any other fancy stuff you like, but until some distant future when it's actually done, then proven and well supported, I'll keep doing this.

Not to repeat my prev post... but in reply to Walter's take on it, it would be interesting if 'shared' just added implicit lock()/unlock() methods to do the mutex acquisition and then removed the cast requirement, but had the language runtime assert that the object is locked whenever it is accessed (this guarantees the safety in a more useful way; the casts are really annoying). I can't imagine a simpler and more immediately useful solution.

In fact, it's a reasonably small step to this being possible with user-defined attributes. Although attributes have no current mechanism to add a mutex, and lock/unlock methods to the object being attributed (like
Nov 15 2012
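A rough library approximation of the proposal, with every name hypothetical: the datum carries its own mutex, lock()/unlock() are explicit, and each access asserts that the lock is held instead of requiring a cast:

---
import core.sync.mutex : Mutex;

struct Guarded(T)
{
    private T value;
    private Mutex mtx;
    private bool held;

    @disable this(this); // copying would duplicate the 'held' flag

    void lock()   { mtx.lock(); held = true; }
    void unlock() { held = false; mtx.unlock(); }

    // all access funnels through here, enforcing "must be locked"
    ref T get()
    {
        assert(held, "guarded data accessed without holding its lock");
        return value;
    }
}

Guarded!T guarded(T)()
{
    Guarded!T g;
    g.mtx = new Mutex;
    return g;
}
---

Usage would look like: auto x = guarded!int(); x.lock(); x.get() += 1; x.unlock(); so a runtime assert replaces the explicit cast step.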
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-11-15 10:22, Manu wrote:

 Not to repeat my prev post... but in reply to Walter's take on it, it
 would be interesting if 'shared' just added implicit lock()/unlock()
 methods to do the mutex acquisition and then remove the cast
 requirement, but have the language runtime assert that the object is
 locked whenever it is accessed (this guarantees the safety in a more
 useful way, the casts are really annoying). I can't imagine a simpler and
 more immediately useful solution.
How about implementing a library function, something like this:

    shared int i;

    lock(i, (x) {
        // operate on x
    });

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB

-- 
/Jacob Carlborg
Nov 15 2012
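A minimal sketch of the helper being described; the real implementation is in the pastebin, so this version, including its single global mutex, is only an assumption:

---
import core.sync.mutex : Mutex;

private __gshared Mutex lockMutex;

shared static this() { lockMutex = new Mutex; }

void lock(T)(ref shared(T) var, scope void delegate(ref T) dg)
{
    lockMutex.lock();
    scope (exit) lockMutex.unlock();
    dg(*cast(T*) &var); // shared is stripped only while the lock is held
}

void example()
{
    static shared int i;
    lock(i, (ref int x) { x += 1; }); // operate on x as a plain int
}
---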
parent reply Manu <turkeyman gmail.com> writes:
On 15 November 2012 12:14, Jacob Carlborg <doob me.com> wrote:

 On 2012-11-15 10:22, Manu wrote:

  Not to repeat my prev post... but in reply to Walter's take on it, it
 would be interesting if 'shared' just added implicit lock()/unlock()
 methods to do the mutex acquisition and then remove the cast
 requirement, but have the language runtime assert that the object is
 locked whenever it is accessed (this guarantees the safety in a more
 useful way, the casts are really annoying). I can't imagine a simpler and
 more immediately useful solution.
How about implementing a library function, something like this:

    shared int i;

    lock(i, (x) {
        // operate on x
    });

* "lock" will acquire a lock
* Cast away shared for "i"
* Call the delegate with the now plain "int"
* Release the lock

http://pastebin.com/tfQ12nJB
Interesting concept. Nice idea, could certainly be useful, but it doesn't address the problem as directly as my suggestion. There are still many problem situations, for instance, any time a template is involved. The template doesn't know to do that internally, but under my proposal, you lock it prior to the workload, and then the template works as expected. Templates won't just break and fail whenever shared is involved, because assignments would be legal. They'll just assert that the thing is locked at the time, which is the programmer's responsibility to ensure.
Nov 15 2012
next sibling parent luka8088 <luka8088 owave.net> writes:
On 15.11.2012 11:52, Manu wrote:
 On 15 November 2012 12:14, Jacob Carlborg <doob me.com> wrote:

     On 2012-11-15 10:22, Manu wrote:

         Not to repeat my prev post... but in reply to Walter's take on it, it
         would be interesting if 'shared' just added implicit lock()/unlock()
         methods to do the mutex acquisition and then remove the cast
         requirement, but have the language runtime assert that the object is
         locked whenever it is accessed (this guarantees the safety in a more
         useful way, the casts are really annoying). I can't imagine a simpler
         and more immediately useful solution.


     How about implementing a library function, something like this:

     shared int i;

     lock(i, (x) {
          // operate on x
     });

     * "lock" will acquire a lock
     * Cast away shared for "i"
     * Call the delegate with the now plain "int"
     * Release the lock

     http://pastebin.com/tfQ12nJB


 Interesting concept. Nice idea, could certainly be useful, but it
 doesn't address the problem as directly as my suggestion.
 There are still many problem situations, for instance, any time a
 template is involved. The template doesn't know to do that internally,
 but under my proposal, you lock it prior to the workload, and then the
 template works as expected. Templates won't just break and fail whenever
 shared is involved, because assignments would be legal. They'll just
  assert that the thing is locked at the time, which is the programmer's
 responsibility to ensure.
I managed to make a simple example that works with the current implementation:

http://dpaste.dzfl.pl/27b6df62
http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=4#post-k7s0gs:241h45:241:40digitalmars.com

It seems to me that solving this shared issue cannot be done purely at the compiler level but will require runtime support. Actually, I don't see how it can be done properly without being able to say "this lock must be locked when accessing this variable".

http://dpaste.dzfl.pl/edbd3e10
Nov 15 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-11-15 11:52, Manu wrote:

 Interesting concept. Nice idea, could certainly be useful, but it
 doesn't address the problem as directly as my suggestion.
 There are still many problem situations, for instance, any time a
 template is involved. The template doesn't know to do that internally,
 but under my proposal, you lock it prior to the workload, and then the
 template works as expected. Templates won't just break and fail whenever
 shared is involved, because assignments would be legal. They'll just
  assert that the thing is locked at the time, which is the programmer's
 responsibility to ensure.
I don't understand how a template would cause problems. -- /Jacob Carlborg
Nov 15 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 11:22:30 Manu wrote:
 Not to repeat my prev post... but in reply to Walter's take on it, it would
 be interesting if 'shared' just added implicit lock()/unlock() methods to
 do the mutex acquisition and then remove the cast requirement, but have the
 language runtime assert that the object is locked whenever it is accessed
 (this guarantees the safety in a more useful way, the casts are really
 annoying). I can't imagine a simpler and more immediately useful solution.
 
 In fact, it's a reasonably small step to this being possible with
 user-defined attributes. Although attributes have no current mechanism to
 add a mutex, and lock/unlock methods to the object being attributed (like

1. It wouldn't stop you from needing to cast away shared at all, because without casting away shared, you wouldn't be able to pass it to anything, because the types would differ. Even if you were arguing that doing something like

void foo(C c) {...}
shared c = new C;
foo(c); //no cast required, lock automatically taken

it wouldn't work, because then foo could squirrel away a reference to c somewhere, and the type system would have no way of knowing that it was a shared variable that was being squirreled away as opposed to a thread-local one, which means that it'll likely generate incorrect code. That can happen with the cast as well, but at least in that case, you're forced to be explicit about it, and it's automatically @system. If it's done for you, it'll be easy to miss and screw up.

2. It's often the case that you need to lock/unlock groups of stuff together, such that locking specific variables is often of limited use and would just introduce pointless extra locks when dealing with multiple variables. It would also increase the risk of deadlocks, because you wouldn't have much - if any - control over what order locks were acquired in when dealing with multiple shared variables.

- Jonathan M Davis
Nov 15 2012
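Jonathan's second point, in code (names illustrative): with one lock per variable and no control over acquisition order, two threads touching the same pair of variables in opposite orders can deadlock.

---
import core.sync.mutex : Mutex;

__gshared Mutex lockA, lockB; // imagine these implicitly attached to two shared variables

void threadOne()
{
    lockA.lock();               // holds A...
    scope (exit) lockA.unlock();
    lockB.lock();               // ...and waits for B
    scope (exit) lockB.unlock();
    // work on both variables
}

void threadTwo()
{
    lockB.lock();               // holds B...
    scope (exit) lockB.unlock();
    lockA.lock();               // ...and waits for A: classic deadlock
    scope (exit) lockA.unlock();
    // work on both variables
}
---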
prev sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, November 15, 2012 11:22:30 Manu wrote:
 Not to repeat my prev post... but in reply to Walter's take on it, it would
 be interesting if 'shared' just added implicit lock()/unlock() methods to
 do the mutex acquisition and then remove the cast requirement, but have the
 language runtime assert that the object is locked whenever it is accessed
 (this guarantees the safety in a more useful way, the casts are really
 annoying). I can't imagine a simpler and more immediately useful solution.

 In fact, it's a reasonably small step to this being possible with
 user-defined attributes. Although attributes have no current mechanism to
 add a mutex, and lock/unlock methods to the object being attributed (like

 1. It wouldn't stop you from needing to cast away shared at all, because
 without casting away shared, you wouldn't be able to pass it to anything,
 because the types would differ. Even if you were arguing that doing something
 like

 void foo(C c) {...}
 shared c = new C;
 foo(c); //no cast required, lock automatically taken

 it wouldn't work, because then foo could squirrel away a reference to c
 somewhere, and the type system would have no way of knowing that it was a
 shared variable that was being squirreled away as opposed to a thread-local
 one, which means that it'll likely generate incorrect code. That can happen
 with the cast as well, but at least in that case, you're forced to be explicit
 about it, and it's automatically @system. If it's done for you, it'll be easy
 to miss and screw up.
I don't really see the difference, other than, as you say, the cast is explicit. Obviously the possibility for the situation you describe exists; it's equally possible with the cast, except this way the usage pattern is made more convenient, the user has a convenient way to control the locks and, most importantly, it would work with templates.

That said, this sounds like another perfect application of 'scope'. Perhaps only scope parameters can receive a locked, shared thing... that would mechanically protect you against escape.

 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.

Perhaps a usage mechanism could be more like:

    shared int x, y, z;
    synchronised with(x, y, z)
    {
        // do work with x, y, z, all locked together.
    }
Nov 15 2012
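A library sketch of what such a grouped synchronised-with could lower to; the helper's name and signature are invented, and it avoids the ordering problem by always acquiring the group's mutexes in one global order (by address):

---
import core.sync.mutex : Mutex;
import std.algorithm : sort;

void synchronisedWith(scope void delegate() dg, Mutex[] mutexes...)
{
    auto order = mutexes.dup;
    // a single global acquisition order prevents deadlock
    sort!((a, b) => cast(void*) a < cast(void*) b)(order);
    foreach (m; order)
        m.lock();
    scope (exit)
        foreach_reverse (m; order)
            m.unlock();
    dg(); // run the body with the whole group locked
}
---

A call would look like synchronisedWith({ /* work with x, y, z */ }, mx, my, mz), with mx/my/mz being the mutexes guarding the three variables.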
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Nov 11, 2012, at 6:30 PM, Walter Bright <newshound2 digitalmars.com> wrote:

 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
Nov 15 2012
parent reply "Jason House" <jason.james.house gmail.com> writes:
On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright 
 <newshound2 digitalmars.com> wrote:
 
 To make a shared type work in an algorithm, you have to:
 
 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nov 17 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
Nov 18 2012
next sibling parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 19.11.2012 05:57, deadalnix wrote:
 On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
But you can at least prove ownership under some limited circumstances. Limited, but (without having tested it on a large scale) still practical. Interest seems even more limited than those circumstances, but anyway: http://forum.dlang.org/thread/k831b6$1368$1 digitalmars.com (the same approach that I already posted in this thread, but in a state that should be more or less bulletproof)
Nov 19 2012
prev sibling parent "Jason House" <jason.james.house gmail.com> writes:
On Monday, 19 November 2012 at 04:57:16 UTC, deadalnix wrote:
 On 17/11/2012 05:49, Jason House wrote:
 On Thursday, 15 November 2012 at 16:31:43 UTC, Sean Kelly 
 wrote:
 On Nov 11, 2012, at 6:30 PM, Walter Bright
 <newshound2 digitalmars.com> wrote:
 To make a shared type work in an algorithm, you have to:

 1. ensure single threaded access by acquiring a mutex
 2. cast away shared
 3. operate on the data
 4. cast back to shared
 5. release the mutex
So what happens if you pass a reference to the now non-shared object to a function that caches a local reference to it? Half the point of the attribute is to protect us from accidents like this.
The constructive thing to do may be to try and figure out what users should be allowed to do with locked shared data... I think the basic idea is that no references can be escaped; SafeD rules could probably help with that. Non-shared member functions might also need to be tagged with their ability to be called on locked, shared data.
Nothing is safe if ownership cannot be statically proven. This is completely useless.
Bartosz's design was very explicit about ownership, but was deemed too complex for D2. Shared was kept simple, but underpowered. Here's what I remember of Bartosz's design:

- Shared object members are owned by the enclosing container unless explicitly marked otherwise
- Lock-free shared data is marked differently
- Non-lock-free shared objects required locking them prior to access, but did not require separate shared and non-shared code
- No sequential consistency

I really liked his design, but I think the explicit ownership part was considered too complex. There may still be something that can be done to improve D2, but I doubt it'd be a complete solution.
Nov 20 2012
prev sibling next sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
 
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library-based solution, and a nice way to use shared is to only use it for disabling access to non-shared members while its monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:

---
class MyClass {
    void method();
}

void main()
{
    auto inst = new shared(MyClass);
    //inst.method(); // forbidden

    {
        ScopedLock!MyClass l = lock(inst);
        l.method(); // now allowed as long as 'l' is in scope
    }

    // can also be called like this:
    inst.lock().method();
}
---

ScopedLock is non-copyable and handles the dirty details of locking and casting away 'shared' when it's safe to do so. No tagging of the class with 'synchronized' or 'shared' needs to be done and everything works nicely without casts.

This comes with a restriction, though. Doing all this is only safe as long as the instance is known to not contain any unisolated aliasing*. So use would be restricted to types that contain only immutable or unique/isolated references. So I also implemented an Isolated!(T) type that is recognized by ScopedLock, as well as functions such as spawn(). The resulting usage can be seen in the example at the bottom. It doesn't provide all the flexibility that a built-in 'isolated' type would, but the possible use cases at least look interesting.

There are still some details to be worked out, such as writing a spawn() function that correctly moves Isolated!() parameters instead of copying them, or the forward reference error mentioned in the example. I'll now try and see if some of my earlier multi-threading designs fit into this system.

---
import std.stdio;
import std.typecons;
import std.traits;
import stdx.typecons;

class SomeClass {
}

class Test {
    private {
        string m_test1 = "test 1";
        Isolated!SomeClass m_isolatedReference;
        // currently causes a size forward reference error:
        //Isolated!Test m_next;
    }

    this()
    {
        //m_next = ...;
    }

    void test1() const { writefln(m_test1); }
    void test2() const { writefln("test 2"); }
}

void main()
{
    writefln("Shared locking");
    // create a shared instance of Test - no members will be accessible
    auto t = new shared(Test);
    {
        // temporarily lock t to make all non-shared members safely
        // available; lock() works only for objects with no unisolated
        // aliasing.
        ScopedLock!Test l = lock(t);
        l.test1();
        l.test2();
    }

    // passing a shared object to a different thread works as usual
    writefln("Shared spawn");
    spawn(&myThreadFunc1, t);

    // create an isolated instance of Test
    // currently, Test may not contain unisolated aliasing, but this
    // requirement may get lifted, as long as only pure methods are called
    Isolated!Test u = makeIsolated!Test();

    // move ownership to a different function and recover
    writefln("Moving unique");
    Isolated!Test v = myThreadFunc2(u.move());

    // moving to a different thread also works
    writefln("Moving unique spawn");
    spawn(&myThreadFunc2, v.move());

    // another possibility is to convert to immutable
    auto w = makeIsolated!Test();
    writefln("Convert to immutable spawn");
    spawn(&myThreadFunc3, w.freeze());

    // or just lose the isolation and act on the base type
    writefln("Convert to mutable");
    auto x = makeIsolated!Test();
    Test xm = x.extract();
    xm.test1();
    xm.test2();
}

void myThreadFunc1(shared(Test) t)
{
    // call non-shared methods on a shared object
    t.lock().test1();
    t.lock().test2();
}

Isolated!Test myThreadFunc2(Isolated!Test t)
{
    // call methods as usual on an isolated object
    t.test1();
    t.test2();
    return t.move();
}

void myThreadFunc3(immutable(Test) t)
{
    t.test1();
    t.test2();
}

// fake spawn function just to test the type constraints
void spawn(R, ARGS...)(R function(ARGS) func, ARGS args)
{
    foreach( i, T; ARGS )
        static assert(!hasUnisolatedAliasing!T || !hasUnsharedAliasing!T,
            "Parameter "~to!string(i)~" of type "~T.stringof~" has unshared"
            ~" or unisolated aliasing. Cannot safely be passed to a"
            ~" different thread.");

    // TODO: do this in a different thread...
    // TODO: don't cheat with the 1-parameter move detection
    static if( __traits(compiles, func(args[0])) ) func(args);
    else func(args[0].move());
}
---

* shared aliasing would also be OK, but this is not yet handled by the implementation.
Nov 12 2012
next sibling parent reply "Regan Heath" <regan netmail.co.nz> writes:
On Mon, 12 Nov 2012 11:41:00 -0000, Sönke Ludwig <sludwig outerproduct.org> wrote:

 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
 After reading Walter's comment, it suddenly seemed obvious that we are
 currently using 'shared' the wrong way. Shared is just not meant to be
 used on objects at all (or only in some special cases like
 synchronization primitives). I just experimented a bit with a statically
 checked library-based solution and a nice way to use shared is to only
 use it for disabling access to non-shared members while its monitor is
 not locked. A ScopedLock proxy and a lock() function can be used for
 this:
I had exactly the same idea:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnsrds954xghj:40puck.auriga.bhead.co.uk

But, then I went right back the other way:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnt4iyz54xghj:40puck.auriga.bhead.co.uk

I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
Nov 12 2012
parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 13:33, Regan Heath wrote:
 On Mon, 12 Nov 2012 11:41:00 -0000, Sönke Ludwig
 <sludwig outerproduct.org> wrote:
 
 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). I just experimented a bit with a statically checked library based solution and a nice way to use shared is to only use it for disabling access to non-shared members while its monitor is not locked. A ScopedLock proxy and a lock() function can be used for this:
I had exactly the same idea:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnsrds954xghj:40puck.auriga.bhead.co.uk

But, then I went right back the other way:

http://forum.dlang.org/thread/k7orpj$1tt5$1 digitalmars.com?page=2#post-op.wnnt4iyz54xghj:40puck.auriga.bhead.co.uk

I think we can definitely create a library solution like the one you propose below, and it should work quite well. But, I reckon it would be even nicer if the compiler did just a little bit of the work for us, and we integrated with the built-in synchronized statement. :)

R
The only problem is that for this approach to be safe, any aliasing outside of the object's reference tree that is not 'shared' must be disallowed. To get the maximum use out of this, some kind of 'isolated'/'unique' qualifier is needed again. So a built-in language solution - which would definitely be highly desirable - that allows this would also either have to introduce a new type qualifier, or recognize the corresponding library structure which implements this. Since for various reasons both possibilities have a questionable probability of being implemented, I decided to go and see what can be done with the current state. By now I would be more than happy to have _any_ decent solution that works and that can also be recommended to others.
Nov 12 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 12:41, Sönke Ludwig wrote:
 On 11.11.2012 19:46, Alex Rønne Petersen wrote:
 Something needs to be done about shared. I don't know what, but the
 current situation is -- and I'm really not exaggerating here --
 laughable. I think we either need to just make it perfectly clear that
 shared is for documentation purposes and nothing else, or, figure out an
 alternative system to shared, because I don't see shared actually being
 useful for real world work no matter what we do with it.
After reading Walter's comment, it suddenly seemed obvious that we are currently using 'shared' the wrong way. Shared is just not meant to be used on objects at all (or only in some special cases like synchronization primitives). <snip>
With some kind of ownership in the type system, it can be made automatic that shared is cast away on synchronized objects.
Nov 12 2012
parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 14:00, deadalnix wrote:
 
 With some kind of ownership in the type system, it can be made automatic
 that shared is cast away on synchronized objects.
Yes, and I would love to have that, but I fear that we then basically end up where Bartosz Milewski was at the end of his research. And unfortunately that went too far to be considered for (mid-term) inclusion. Besides its shortcomings, there are also actually some advantages to a library-based solution. For example, it could be allowed to customize the lock()/unlock() function so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network-based distributed object systems.
Nov 12 2012
parent deadalnix <deadalnix gmail.com> writes:
On 12/11/2012 14:23, Sönke Ludwig wrote:
 On 12.11.2012 14:00, deadalnix wrote:
 With some kind of ownership in the type system, it can be made automatic
 that shared is cast away on synchronized objects.
Yes, and I would love to have that, but I fear that we then basically end up where Bartosz Milewski was at the end of his research. And unfortunately that went too far to be considered for (mid-term) inclusion. Besides its shortcomings, there are also actually some advantages to a library-based solution. For example, it could be allowed to customize the lock()/unlock() function so that locking could work for fiber-aware mutexes (e.g. http://vibed.org/api/vibe.core.mutex/ ...) or even for network-based distributed object systems.
Don't get me started on fibers /D
Nov 12 2012
prev sibling parent reply Sönke Ludwig <sludwig outerproduct.org> writes:
I generated some quick documentation with examples here:

http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html
http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html
http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html

It does offer some nice improvements: not a single cast, and everything is
statically checked.
Nov 12 2012
parent Sönke Ludwig <sludwig outerproduct.org> writes:
On 12.11.2012 16:27, Sönke Ludwig wrote:
 I generated some quick documentation with examples here:
 
 http://vibed.org/temp/d-isolated-test/stdx/typecons/lock.html
 http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolated.html
 http://vibed.org/temp/d-isolated-test/stdx/typecons/makeIsolatedArray.html
 
 It does offer some nice improvements: not a single cast, and everything is
 statically checked.
 
All examples compile now. Put everything on github for reference: https://github.com/s-ludwig/d-isolated-test
Nov 12 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 04:12:47 Andrej Mitrovic wrote:
 On 11/15/12, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 From what I recall of what TDPL says
It says (on p.413) reading and writing shared values are guaranteed to be atomic, for pointers, arrays, function pointers, delegates, class references, and struct types containing exactly one of these types. Reals are not supported. It also talks about automatically inserting memory barriers on page 414.
Good to know, but none of that really has anything to do with the casting, which is what I was responding to. And looking at that list, it sounds reasonable that all of that would be guaranteed to be atomic, but I think that the fundamental problem that's affecting usability is all of the casting that's typically required. And I don't see any way around that other than writing code that doesn't need to pass shared objects around or using templates very heavily. - Jonathan M Davis
Nov 14 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, November 15, 2012 14:32:47 Manu wrote:
 On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 I don't really see the difference, other than, as you say, the cast is
 explicit.
 Obviously the possibility for the situation you describe exists, it's
 equally possible with the cast, except this way, the usage pattern is made
 more convenient, the user has a convenient way to control the locks and
 most importantly, it would work with templates.
 That said, this sounds like another perfect application of 'scope'. Perhaps
 only scope parameters can receive a locked, shared thing... that would
 mechanically protect you against escape.
You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat. _Maybe_ you could get away with it in very restricted circumstances where both pure and scope are being used, but then it becomes so restrictive that it's nearly useless anyway. And again, it would be hidden from the programmer, when this is something that _needs_ to be explicit. Having implicit locks happen on you could really screw with any code trying to do explicit locks, as would be needed anyway in all but the most basic cases.
 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.
Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks (both in terms of memory and in terms of the runtime costs). Not to mention, it would probably cause all kinds of issues for something like shared int* to have a mutex with it, because then its size is completely different from int*. It also would cause even worse problems when that shared int* was cast to int* (aside from the size issues), because all of the locking that was happening for the shared int* was invisible. If you want automatic locks, then use synchronized classes. That's what they're for.

Honestly, I really don't buy into the idea that it makes sense for shared to magically make multi-threaded code work without the programmer worrying about locks. Making it so that it's well-defined as to what's atomic is great for code that has any chance of being lock-free, but it's still up to the programmer to understand when locks are and aren't needed and how to use them correctly. I don't think that it can possibly work for it to be automatic. It's far too easy to introduce deadlocks, and it would only work in the simplest of cases anyway, meaning that the programmer needs to understand and properly solve the issues anyway. And if the programmer has to understand it all to get it right, why bother adding the extra overhead and deadlock potential caused by automatically locking anything? D provides some great synchronization primitives. People should use them.

I think that the only things that shared really needs to be solving are:

1. Indicating to the compiler via the type system that the object is not thread-local. This properly segregates shared and unshared code and allows the compiler to take advantage of thread locality for optimizations and avoid optimizations with shared code that screw up threading (e.g. double-checked locking won't work if the compiler does certain optimizations).

2. Making it explicit and well-defined as part of the language which operations can be assumed to be atomic (even if that set of operations is very small, having it be well-defined is valuable).

3. Ensuring sequential consistency so that it's possible to do lock-free code when atomic operations permit it and so that there are fewer weird issues due to undefined behavior.

- Jonathan M Davis
Nov 15 2012
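To make point 1 concrete: the classic double-checked locking pattern is only correct when reads and writes of the shared reference are ordered, which is exactly what plain thread-local-style accesses don't promise. A sketch using core.atomic (surrounding names are assumed):

---
import core.atomic;

class Config { int x; this() { x = 42; } }

shared Config instance;
__gshared Object initLock; // assume initialized at program start

Config getConfig()
{
    auto cfg = atomicLoad(instance);        // ordered read
    if (cfg is null)
    {
        synchronized (initLock)
        {
            cfg = atomicLoad(instance);     // re-check under the lock
            if (cfg is null)
            {
                cfg = cast(shared) new Config;
                atomicStore(instance, cfg); // publish only after construction
            }
        }
    }
    return cast(Config) cfg;
}
---

Without the ordered load/store (or the guarantees shared is supposed to provide), the write of the reference may become visible before the writes done in the constructor, so another thread can observe a half-built object.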
prev sibling next sibling parent Manu <turkeyman gmail.com> writes:
On 15 November 2012 15:00, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, November 15, 2012 14:32:47 Manu wrote:
 On 15 November 2012 13:38, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 I don't really see the difference, other than, as you say, the cast is
 explicit.
 Obviously the possibility for the situation you describe exists, it's
 equally possible with the cast, except this way, the usage pattern is made
 more convenient, the user has a convenient way to control the locks and
 most importantly, it would work with templates.
 That said, this sounds like another perfect application of 'scope'. Perhaps
 only scope parameters can receive a locked, shared thing... that would
 mechanically protect you against escape.
You could make casting away const implicit too, which would make some code easier, but it would be a disaster, because the programmer wouldn't have a clue that it's happening in many cases, and the code would end up being very, very wrong. Implicitly casting away shared would put you in the same boat.
... no, they're not even the same thing. const things cannot be changed. Shared things are still mutable things, and perfectly compatible with other non-shared mutable things; they just have some access control requirements.

 _Maybe_ you could get away with it in very restricted circumstances where
 both pure and scope are being used, but then it becomes so restrictive that
 it's nearly useless anyway. And again, it would be hidden from the
 programmer, when this is something that _needs_ to be explicit. Having
 implicit locks happen on you could really screw with any code trying to do
 explicit locks, as would be needed anyway in all but the most basic cases.

I think you must have misunderstood my suggestion; I certainly didn't suggest locking would be implicit. All locks would be explicit; all I suggested is that shared things would gain an associated mutex, and an implicit assert that said mutex is locked whenever the thing is accessed, rather than denying assignment between shared/unshared things. You could use lock methods, or a nice alternative would be to submit them to some sort of synchronised scope like luka illustrates. I'm of the opinion that for the time being, explicit lock control is mandatory (anything else is a distant dream), and atomic primitives may not be relied upon.
 2. It's often the case that you need to lock/unlock groups of stuff together
 such that locking specific variables is often of limited use and would just
 introduce pointless extra locks when dealing with multiple variables. It would
 also increase the risk of deadlocks, because you wouldn't have much - if any -
 control over what order locks were acquired in when dealing with multiple
 shared variables.
Your fear is precisely the state we're in now, except it puts all the work on the user to create and use the synchronisation objects, and also to assert that things are locked when they are accessed. I'm just suggesting some reasonably simple change that would make the situation more usable and safer immediately, short of waiting for all these fantastic designs being discussed having time to simmer and manifest.
Except that with your suggestion, you're introducing potential deadlocks which are outside of the programmer's control, and you're introducing extra overhead with those locks (both in terms of memory and in terms of the runtime costs). Not to mention, it would probably cause all kinds of issues for something like shared int* to have a mutex with it, because then its size is completely different from int*. It also would cause even worse problems when that shared int* was cast to int* (aside from the size issues), because all of the locking that was happening for the shared int* was invisible. If you want automatic locks, then use synchronized classes. That's what they're for. Honestly, I really don't buy into the idea that it makes sense for shared to magically make multi-threaded code work without the programmer worrying about locks. Making it so that it's well-defined as to what's atomic is great for code that has any chance of being lock-free, but it's still up to the programmer to understand when locks are and aren't needed and how to use them correctly. I don't think that it can possibly work for it to be automatic. It's far too easy to introduce deadlocks, and it would only work in the simplest of cases anyway, meaning that the programmer needs to understand and properly solve the issues anyway. And if the programmer has to understand it all to get it right, why bother adding the extra overhead and deadlock potential caused by automatically locking anything? D provides some great synchronization primitives. People should use them.
To all above: You've completely misunderstood my suggestion. It's basically the same as luka's. It's not that hard: shared just assists the user with what they do anyway by associating a lock primitive, and implicitly asserting that it is locked when accessed. No magic should be performed on the user's behalf.

 I think that the only things that shared really needs to be solving are:
 1. Indicating to the compiler via the type system that the object is not
 thread-local. This properly segregates shared and unshared code and allows the
 compiler to take advantage of thread locality for optimizations and avoid
 optimizations with shared code that screw up threading (e.g. double-checked
 locking won't work if the compiler does certain optimizations).

 2. Making it explicit and well-defined as part of the language which operations
 can be assumed to be atomic (even if that set of operations is very small,
 having it be well-defined is valuable).

 3. Ensuring sequential consistency so that it's possible to do lock-free code
 when atomic operations permit it and so that there are fewer weird issues due
 to undefined behavior.

 - Jonathan M Davis
Nov 15 2012
prev sibling parent "Mehrdad" <wfunction hotmail.com> writes:
Would it be useful if 'shared' in D did something like 'volatile' 
in C++ (as in, Andrei's article on volatile-correctness)?
http://www.drdobbs.com/cpp/volatile-the-multithreaded-programmers-b/184403766
Nov 15 2012