
digitalmars.D - Some missing things in the current threading implementation

reply Sönke Ludwig <ludwig informatik.uni-luebeck.de> writes:
Recently I thought it would be a good idea to try out the new concurrency system once again. Some time back, when 'shared' was still new, I had already tried it several times, but since it was completely unusable then, I gave up on it (and, as it seems, many others did too).

Now, however, after TDPL has been released and there is some 
documentation + std.concurrency, the system should be in a state where 
it is actually useful and only some bugs should be there to fix - which 
does not include inherent system changes. The reality is quite different 
once you step anywhere beside the already walked path (defined by the 
book examples and similar things).

Just for the record, I've done a lot with most kinds of threading 
schemes (even if the only lockless thing I implemented was a simple 
Shared/WeakPtr implementation *shiver*). This may very well have the 
effect that there are some patterns burned into my head that somehow 
clash with some ideas behind the current system. But, for most of the 
points, I am quite sure that there is no viable alternative if performance and memory consumption are to be anywhere near the optimum.

I apologize for the length of this post, although I already tried to 
make it as short as possible and left out a lot of details. Also it is 
very possible that I assume some false things about the concurrency 
implementation because my knowledge is mostly based only on the NG and 
the book chapter.

The following problems are those that I found during a one day endeavor 
to convert some parts of my code base to spawn/shared (not really 
successful, partly because of the very viral nature of shared).


1. spawn and objects

	Spawn only supports 'function' + some bound parameters. Since taking 
the address of an object method in D always yields a delegate, it is not 
possible to call class members without a static wrapper function. This can be quite disturbing when working in an object-oriented style (C++ obviously has the same problem).
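	To illustrate, a minimal sketch of the static-wrapper workaround this currently forces (the names are made up, and the cast is exactly the kind of escape hatch one would like to avoid):
	---
		import std.concurrency;

		class Worker {
			void run(){ /* ... */ }

			// spawn(&w.run) is rejected because &w.run is a delegate,
			// so a static function has to forward the call instead.
			static void runWrapper(shared Worker w){
				(cast(Worker)w).run(); // cast away shared, bypassing the checks
			}
		}

		void startWorker(shared Worker w){
			spawn(&Worker.runWrapper, w);
		}
	---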


2. error messages

	Right now, error messages just state that there is a shared/unshared 
mismatch somewhere. For a non-shared-expert, this can be a real bummer. 
You have to know a lot of implications 'shared' has to be able to 
correctly interpret these messages and track down the cause. Not very 
good for a feature that is meant to make threading easier.


3. everything is implicit

	This may seem kind of counter-intuitive, but using 'synchronized' 
classes and features like setSameMutex - which are deadly necessary, it 
is stupid to neglect the importance of lock based threading in an object 
oriented environment - creates a feeling of climbing without a safety 
rope. Not stating how you really want to synchronize/lock and not being 
able to directly read  from the code how this is really done just leaves 
a black-box feeling. This in turn means threading newcomers will not be 
educated, they just use the system somehow and it magically works. But 
as soon as you get problems such as deadlocks, you suddenly have to 
understand the details and in this moment you have to read up and 
remember everything that is going on in the background - plus everything 
you would have to know about threading/synchronization in C. I'm not 
sure if this is the right course here or if there is any better one.


4. steep learning curve - more a high learning wall to climb on

	Resulting from the first points, my feeling tells me that a newcomer, 
who has not followed the discussions and thoughts about the system here, 
will see himself standing before a very high barrier of material to 
learn before he can actually put any of it to use. I also imagine this to be a very painful process because of all the things you discover to be impossible, and the error messages that potentially make you bang your head against the wall.

	
5. advanced synchronization primitives need to be considered

	Things such as core.sync.condition (the most important one) need to be considered in the 'shared' system. This means there either needs to be a condition variable that takes a shared object instead of a mutex, or you have to be able to query an object's mutex.
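	As a minimal sketch of the gap (the Mutex/Condition part below is the existing core.sync API, the comment describes the missing piece):
	---
		import core.sync.mutex     : Mutex;
		import core.sync.condition : Condition;

		void lowLevelWay(){
			auto m = new Mutex;
			auto c = new Condition(m); // works: a Condition is built from an explicit Mutex
			synchronized (m) {
				c.notify();
			}
		}

		// With a 'synchronized class Queue { ... }' there is no way to obtain the
		// hidden per-object mutex to construct a Condition for it - that is the
		// missing integration asked for here.
	---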

	
6. temporary unlock

	There are often situations in lock-based programming in which you need to temporarily unlock your mutex, perform some time-consuming external task (disk I/O, ...) and then reacquire the mutex. This feature, which is important also because it is difficult and dirty to work around, needs language support - something like the inverse of a synchronized {} scope, or the possibility to define a special kind of private member function that unlocks the mutex. Inside such blocks the compiler of course has to make sure that the appropriate access rules are not broken (it could be as conservative as disallowing access to any class member).
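	A hand-written version of the pattern in question, as a sketch of what such a feature would have to express (expensiveDiskRead is a made-up placeholder):
	---
		import core.sync.mutex : Mutex;

		int expensiveDiskRead(string key){ return cast(int)key.length; } // placeholder

		void loadAndStore(Mutex m, int[string] cache, string key){
			synchronized (m) {
				if (key in cache) return;        // fast path, under the lock
			}
			auto value = expensiveDiskRead(key); // slow work with the lock released
			synchronized (m) {
				cache[key] = value;              // reacquire before touching shared state
			}
		}
	---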

	
7. optimization of pseudo-shared objects

	Since the sharability/'synchronized' of an object is already decided at 
class definition time, for performance reasons it should be possible to 
somehow disable the mutex of those instances that are only used thread 
locally. Maybe it should be necessary to declare objects as "shared C 
c;" even if the class is defined as "synchronized class C {}" or you 
will get an object without a mutex which is not shared?

	
8. very strong split of shared and non-shared worlds

	For container classes in particular it is really nasty that you have to define two versions of the container - one shared, the other non-shared - if you want to be able to use it in both contexts and be able to put non-shared objects into it in a non-shared context. Also, there should really be a way to declare a class to be hygienic, in a way similar to pure, so that it could be used in a synchronized context and store shared objects, although it is not shared itself.

	
9. unique

	Unique objects or chunks of data are really important not only to be 
able to check that a cast to 'immutable' is correct, but also to allow 
for passing objects to another thread for computations without making a 
superfluous copy or doing superfluous computation.
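	A small sketch of the cast-to-immutable case this is about:
	---
		immutable(int)[] computeResults(){
			auto buf = new int[](1024);
			foreach (i, ref x; buf)
				x = cast(int)(i * i);            // some thread-local computation
			// Correct only because buf is the sole reference to the data - nothing
			// checks this today; a 'unique' qualifier would make it verifiable.
			return cast(immutable(int)[])buf;
		}
	---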

	
10. recursive locking

	The only option right now is to have mutexes behave recursively. This 
is good to easily avoid deadlocks in the same thread. However, in my 
experience they are very dangerous because typically no one takes into 
account what happens when an algorithm is invoked recursively from the 
middle of its computation. This can happen easily in a threaded 
environment where you often use signals/slots or message passing. In 90% of those situations a deadlock, or at least an assertion in debug mode, is a good indicator that something just happened that should not have. Of course, objects with shared mutexes are a different matter - in that case you actually need an ownership relation to do anything useful with non-recursive mutexes.

	
11. holes in the system

	It seems like there are a lot of ways in which you can still slip non-shared data into a shared context.

	One example is that passing a shared array where a non-shared one is expected
	---
		void fnc(int[] arr);
		void fnc2(){
			shared int[] arr;
			spawn(&fnc, arr);
		}
	---
	
	compiles. This is just a bug and probably easy to fix but what about:
	
	---
		class C {
			private void method();
			private void method2(){
				spawn( void function(C inst){ inst.method(); }, this );
			}
		}
	---
	
	unless private functions resort to recursive locking (which in turn is usually useless overhead), method() will be invoked in a completely unprotected context. This one has to be fixed somehow in the language. I'm sure there are other things like these.

	
12. more practical examples need to be considered

	It seems right now that all the examples used to explore the features needed in the system are of a rather academic nature - either the most simple I/O, pure functional computation, or maybe a network protocol. However, when it comes to practical high-performance computation on real systems, where memory consumption and low-level performance can really matter, there seems to be quite some no-man's-land here.
	
	Here are some simple examples where I immediately came to a grinding halt:
	
	I. An object loader with background processing
	
		You have a shared class Loader which uses multiple threads to load 
objects on demand and then fires a signal or returns from its 
loadObject(x) method.
		
		The problem is that the actual loading of an object must happen outside of a synchronized region of the loader, or you get no parallelism out of this. Also, because of 'spawn' you have to use an external function instead of being able to use a member function directly. Fortunately, in this case that is also the solution: define an external function that takes the arguments needed to load the object, loads it, and then passes the result back to the class (see the sketch below).
		Waiting for finished objects can be implemented using message passing without worry here, because the MP overhead is probably low enough.
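		A minimal sketch of that pattern (loadFromDisk is a made-up placeholder; the result is handed back by message passing):
		---
			import std.concurrency;

			// placeholder for the actual loading work
			immutable(ubyte)[] loadFromDisk(string path){ return null; }

			// a free function instead of a member - it runs outside the Loader's lock
			void loadWorker(Tid owner, string path){
				auto data = loadFromDisk(path);
				send(owner, path, data); // hand the finished object back by message
			}

			// inside the Loader, something like:
			//     spawn(&loadWorker, thisTid, path);
			// and later:
			//     receive( (string path, immutable(ubyte)[] data){ /* store it */ } );
		---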
		
		Features missing:
			- spawn with methods
			- temporary unlock
			
	II. Implementation of a ThreadPool
	
		The majority of applications can very well be broken up into small chunks of work that can be processed in parallel. Instead of using a costly thread-create, run-task, thread-destroy cycle, it would be wise to reuse the threads for later tasks. The implementation of a thread pool that does this is of course a low-level thing, and you could argue that it is ok to use some casts and such here. Still, quite a few things are missing (a sketch of the core worker loop follows below).
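		Written against the low-level primitives, the core of such a pool looks roughly like this (note that no 'shared' checking is involved here, which is exactly the problem):
		---
			import core.sync.mutex     : Mutex;
			import core.sync.condition : Condition;

			alias void delegate() Task;

			class SimplePool {
				private Mutex     m;
				private Condition c;
				private Task[]    queue;

				this(){ m = new Mutex; c = new Condition(m); }

				void put(Task t){
					synchronized (m) { queue ~= t; c.notify(); }
				}

				// each worker thread runs this loop, sleeping on the condition
				private void workerLoop(){
					for (;;) {
						Task t;
						synchronized (m) {
							while (queue.length == 0) c.wait();
							t = queue[0];
							queue = queue[1 .. $];
						}
						t(); // run the task outside the lock
					}
				}
			}
		---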
		
		Features Missing:
			- spawn with methods
			- temporary unlock
			- condition variables (message passing too slow + you need to manage 
destinations)

	III. multiple threads computing separate parts of an array
		
		Probably the most simple form of parallelism is to perform similar 
operations on each element of an array (or similar things on regions of 
the array) and to do this in separate threads.
		The good news is that this works in the current implementation. The bad news is that it is really slow, because either you use atomic operations on the elements or it is unsafe and prone to low-level races. Right now the compiler checks almost nothing.
		The alternative would be to pass unique slices to each thread.
		
		To illustrate the current situation, this compiles and runs:

		---
			import std.concurrency;
			import std.stdio;

			void doCompute(size_t offset, int[] arr){ // arr should be shared
				foreach(i, ref el; arr){
					el *= 2; // should be an atomic operation, which would make this useless because of the performance penalty
					writefln("Thread %s computed element %d: %d", thisTid(), i + offset, cast(int)el);
				}
			}

			void waitForThread(Tid thread){
				// TODO: implement in some complex way using messages or maybe there is a simple function for this
			}

			void main(){
				shared int[] myarray = [1, 2, 3, 4];
				Tid[2] threads;
				foreach( i, ref t; threads )
					t = spawn(&doCompute, i*2, myarray[i*2 .. (i+1)*2]); // should error out because the slice is not shared
				foreach( t; threads )
					waitForThread(t);
			}
		---
		
		Features missing:
			- unique
			- some way to safely partition/slice an array and get a set of still 
unique slices


- Sönke
Sep 12 2010
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Sönke_Ludwig (ludwig informatik.uni-luebeck.de)'s article
 Now, however, after TDPL has been released and there is some
 documentation + std.concurrency, the system should be in a state where
 it is actually useful and only some bugs should be there to fix - which
 does not include inherent system changes. The reality is quite different
 once you step anywhere beside the already walked path (defined by the
 book examples and similar things).
std.concurrency takes the point of view that simplicity and safety should come first, and performance and flexibility second. I thoroughly appreciate this post, as it gives ideas for either improving std.concurrency or creating alternative models.
 I apologize for the length of this post, although I already tried to
 make it as short as possible and left out a lot of details.
No need to apologize, I think it's great that you're willing to put this much effort into it.
 1. spawn and objects
 	Spawn only supports 'function' + some bound parameters. Since taking
 the address of an object method in D always yields a delegate, it is not
 possible to call class members without a static wrapper function. This
 can be quite disturbing when working object oriented (C++ obviously has
 the same problem).
Except in the case of an immutable or shared object this would be unsafe, as it would allow implicit sharing. I do agree, though, that delegates need to be allowed if they're immutable or shared delegates. Right now taking the address of a shared/immutable member function doesn't yield a shared/immutable delegate. There are bug reports somewhere in Bugzilla on this.
 2. error messages
 	Right now, error messages just state that there is a shared/unshared
 mismatch somewhere. For a non-shared-expert, this can be a real bummer.
 You have to know a lot of implications 'shared' has to be able to
 correctly interpret these messages and track down the cause. Not very
 good for a feature that is meant to make threading easier.
Agreed. Whenever you run into an unreasonably obtuse error message, a bug report would be appreciated. Bug reports related to wrong or extremely obtuse error messages are considered "real", though low priority, bugs around here.
 4. steep learning curve - more a high learning wall to climb on
 	Resulting from the first points, my feeling tells me that a newcomer,
 who has not followed the discussions and thoughts about the system here,
 will see himself standing before a very high barrier of material to
 learn, before he can actually put anything of it to use. Also I imagine
 this to be a very painful process because of all the things that you
 discover are not possible or those error messages that potentially make
 you banging your head against the wall.
True, but I think this is just a fact of life when dealing with concurrency in general. Gradually (partly due to the help of people like you pointing out the relevant issues) the documentation, etc. will improve.
 5. advanced synchronization primitives need to be considered
 	Things such as core.sync.condition (the most important one) need to be
 considered in the 'shared'-system. This means there needs to be a
 condition variable that takes a shared object instead of a mutex or you
 have to be able to query an objects mutex.
The whole point of D's flagship concurrency model is that you're supposed to use message passing for most things. Therefore, lock-based programming is kind of half-heartedly supported. It sounds like you're looking for a low-level model (which is available via core.thread and core.sync, though it isn't the flagship model). std.concurrency is meant to be a high-level model useful for simple, safe everyday concurrency, not the **only** be-all-and-end-all model of multithreading in D.
 6. temporary unlock
 	There are often situations when you do lock-based programming, in which
 you need to temporarily unlock your mutex, perform some time consuming
 external task (disk i/o, ...) and then reaquire the mutex. For this
 feature, which is really important also because it is really difficult
 and dirty to work around it, needs language support, could be something
 like the inverse of a synchronized {} scope or the possibility to define
 a special kind of private member function that unlocks the mutex. Then,
 inside whose blocks the compiler of course has to make sure that the
 appropriate access rules are not broken (could be as conservative as
 disallowing access to any class member).
Again, the point of std.concurrency is to be primarily message passing-based. It really sounds like what you want is a lower-level model. Again, it's available, but it's not considered the flagship model.
 7. optimization of pseudo-shared objects
 	Since the sharability/'synchronized' of an object is already decided at
 class definition time, for performance reasons it should be possible to
 somehow disable the mutex of those instances that are only used thread
 locally. Maybe it should be necessary to declare objects as "shared C
 c;" even if the class is defined as "synchronized class C {}" or you
 will get an object without a mutex which is not shared?
Agreed. IMHO locks should only be taken on a synchronized object if its compile-time type is shared. Casting away shared should result in locks not being used.
 9. unique
 	Unique objects or chunks of data are really important not only to be
 able to check that a cast to 'immutable' is correct, but also to allow
 for passing objects to another thread for computations without making a
 superfluous copy or doing superfluous computation.
A Unique type is in std.typecons. I don't know how well it currently works, but I agree that we need a way to express uniqueness to make creating immutable data possible.
 11. holes in the system
 	It seems like there are a lot of ways in which you can still slip in
 non-shared data into a shared context.
 	One example is that you can pass a shared array
 	---
 		void fnc(int[] arr);
 		void fnc2(){
 			shared int[] arr;
 			spawn(&fnc, arr);
 		}
 	---
 	compiles. This is just a bug and probably easy to fix but what about:
Definitely just a bug.
 	---
 		class C {
 			private void method();
 			private void method2(){
 				spawn( void function(C inst){ inst.method(); }, this );
 			}
 		}
 	---
Just tested this, and it doesn't compile.
 	II. Implementation of a ThreadPool
 		The majority of applications can very well be broken up into small
 chunks of work that can be processed in parallel. Instead of using a
 costly thread-create, run task, thread-destroy cycle, it would be wise
 to reuse the threads for later tasks. The implementation of a thread
 pool that does this is of course a low-level thing and you could argue
 that it is ok to use some casts and such stuff here. Anyway, there are
 quite some things missing here.
My std.parallelism module that's currently being reviewed for inclusion in Phobos has a thread pool and task parallelism, though it is completely unsafe (i.e. it allows implicit sharing and will not be allowed in safe code). std.concurrency was simply not designed for pull-out-all-stops parallelism, and pull-out-all-stops parallelism is inherently harder than basic concurrency to make safe. I've given up making most std.parallelism safe, but I think I may be able to make a few islands of it safe. The question is whether those islands would allow enough useful things to be worth the effort. See the recent safe asynchronous function calls thread. Since it sounds like you need something like this, I'd sincerely appreciate your comments on this module. The docs are at: http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html Code is at: http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
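A minimal usage sketch of the parallel foreach described in those docs (assuming the taskPool.parallel interface documented there; treat the details as approximate):
---
import std.parallelism : taskPool;
import std.stdio : writeln;

void main(){
	auto nums = new int[](1_000);
	// iterations are distributed over the pool's worker threads; each element
	// is only touched by one worker, but nothing in the type system enforces that
	foreach (i, ref x; taskPool.parallel(nums))
		x = cast(int)(i * 2);
	writeln(nums[0 .. 10]);
}
---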
 	III. multiple threads computing separate parts of an array
 		Probably the most simple form of parallelism is to perform similar
 operations on each element of an array (or similar things on regions of
 the array) and to do this in separate threads.
 		The good news is that this works in the current implementation. The
 bad news is that this is really slow because you have to use atomic
 operations on the elements or it is unsafe and prone to low-level races.
 Right now the compiler checks almost nothing.
Also in the proposed std.parallelism module, though completely unsafe because it needs to be fast.
Sep 12 2010
parent Sönke Ludwig <ludwig informatik.uni-luebeck.de> writes:
 1. spawn and objects
 	Spawn only supports 'function' + some bound parameters. Since taking
 the address of an object method in D always yields a delegate, it is not
 possible to call class members without a static wrapper function. This
 can be quite disturbing when working object oriented (C++ obviously has
 the same problem).
Except in the case of an immutable or shared object this would be unsafe, as it would allow implicit sharing. I do agree, though, that delegates need to be allowed if they're immutable or shared delegates. Right now taking the address of a shared/immutable member function doesn't yield a shared/immutable delegate. There are bug reports somewhere in Bugzilla on this.
Good to know that there are already bug reports. I remember the discussion about allowing shared(delegate) or immutable(delegate) and this would be a possible solution. However, I still find the idea that those attributes are bound to the delegate type awkward and wrong, as a delegate is typically supposed to hide away the internally used objects/state and this is just a special case for direct member function delegates (what about an inline (){ obj.method(); }?). Also const is not part of the delegate and does not have to be because it can be checked at delegate creation time. But this is probably a topic on its own.
 2. error messages
 	Right now, error messages just state that there is a shared/unshared
 mismatch somewhere. For a non-shared-expert, this can be a real bummer.
 You have to know a lot of implications 'shared' has to be able to
 correctly interpret these messages and track down the cause. Not very
 good for a feature that is meant to make threading easier.
Agreed. Whenever you run into an unreasonably obtuse error message, a bug report would be appreciated. Bug reports related to wrong or extremely obtuse error messages are considered "real", though low priority, bugs around here.
I will definitely file bug reports when I continue in this area, I just wanted to stress how important the error messages are in this part of the language, because the root cause is most often very non-obvious compared to other type-system errors.
 4. steep learning curve - more a high learning wall to climb on
 	Resulting from the first points, my feeling tells me that a newcomer,
 who has not followed the discussions and thoughts about the system here,
 will see himself standing before a very high barrier of material to
 learn, before he can actually put anything of it to use. Also I imagine
 this to be a very painful process because of all the things that you
 discover are not possible or those error messages that potentially make
 you banging your head against the wall.
True, but I think this is just a fact of life when dealing with concurrency in general. Gradually (partly due to the help of people like you pointing out the relevant issues) the documentation, etc. will improve.
...plus that even for someone who is already experienced with threading in other languages, there is a lot to know now in D if you go the shared path instead of the C++/__gshared path.
 5. advanced synchronization primitives need to be considered
 	Things such as core.sync.condition (the most important one) need to be
 considered in the 'shared'-system. This means there needs to be a
 condition variable that takes a shared object instead of a mutex or you
 have to be able to query an objects mutex.
The whole point of D's flagship concurrency model is that you're supposed to use message passing for most things. Therefore, lock-based programming is kind of half-heartedly supported. It sounds like you're looking for a low-level model (which is available via core.thread and core.sync, though it isn't the flagship model). std.concurrency is meant to be a high-level model useful for simple, safe everyday concurrency, not the **only** be-all-and-end-all model of multithreading in D.
 6. temporary unlock
 	There are often situations when you do lock-based programming, in which
 you need to temporarily unlock your mutex, perform some time consuming
 external task (disk i/o, ...) and then reaquire the mutex. For this
 feature, which is really important also because it is really difficult
 and dirty to work around it, needs language support, could be something
 like the inverse of a synchronized {} scope or the possibility to define
 a special kind of private member function that unlocks the mutex. Then,
 inside whose blocks the compiler of course has to make sure that the
 appropriate access rules are not broken (could be as conservative as
 disallowing access to any class member).
Again, the point of std.concurrency is to be primarily message passing-based. It really sounds like what you want is a lower-level model. Again, it's available, but it's not considered the flagship model.
Agreed that the flagship model is message passing and to a degree I think that is quite reasonable (except that object orientation + message passing comes a bit too short IMO). However, I think the support for the rest is a bit too half hearted if you have to use casts for everything. There are quite some low hanging fruits where a simple syntax or library extension could increase the flexibility without sacrificing safety or complexity.
 9. unique
Just tested this, and it doesn't compile.
Forgot the 'shared' in that example:

---
import std.concurrency;

synchronized class Test {
	void publicMethod(){
		spawn( function void(shared Test inst){ inst.privateMethod(); }, this );
	}
	private void privateMethod(){
	}
}
---
 	II. Implementation of a ThreadPool
My std.parallelism module that's currently being reviewed for inclusion in Phobos has a thread pool and task parallelism, though it is completely unsafe (i.e. it allows implicit sharing and will not be allowed in safe code). std.concurrency was simply not designed for pull-out-all-stops parallelism, and pull-out-all-stops parallelism is inherently harder than basic concurrency to make safe. I've given up making most std.parallelism safe, but I think I may be able to make a few islands of it safe. The question is whether those islands would allow enough useful things to be worth the effort. See the recent safe asynchronous function calls thread. Since it sounds like you need something like this, I'd sincerely appreciate your comments on this module. The docs are at: http://cis.jhu.edu/~dsimcha/d/phobos/std_parallelism.html Code is at: http://dsource.org/projects/scrapple/browser/trunk/parallelFuture/std_parallelism.d
 	III. multiple threads computing separate parts of an array
Also in the proposed std.parallelism module, though completely unsafe because it needs to be fast.
I will definitely be looking into std.parallelism (I already have a thread pool, but that right now is not really sophisticated, mostly because of the previous lack of some advanced synchronization primitives).
Sep 14 2010
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
I must say I agree with most of your observations. Here are some comments...

On 2010-09-12 09:35:42 -0400, Sönke Ludwig 
<ludwig informatik.uni-luebeck.de> said:

 3. everything is implicit
 
 	This may seem kind of counter-intuitive, but using 'synchronized' 
 classes and features like setSameMutex - which are deadly necessary, it 
 is stupid to neglect the importance of lock based threading in an 
 object oriented environment - creates a feeling of climbing without a 
 safety rope. Not stating how you really want to synchronize/lock and 
 not being able to directly read  from the code how this is really done 
 just leaves a black-box feeling. This in turn means threading newcomers 
 will not be educated, they just use the system somehow and it magically 
 works. But as soon as you get problems such as deadlocks, you suddenly 
 have to understand the details and in this moment you have to read up 
 and remember everything that is going on in the background - plus 
 everything you would have to know about threading/synchronization in C. 
 I'm not sure if this is the right course here or if there is any better 
 one.
I'm a little uncomfortable with implicit synchronization too. Ideally you should do as little as possible from inside a synchronized statement, and be careful about what functions you call (especially if they take some other lock). But the way synchronized classes work, they basically force you to do the reverse -- put everything under the lock -- and you don't have much control over it. Implicit synchronization is good for the simple getter/setter case, but for longer functions they essentially encourage a bad practice.
 6. temporary unlock
 
 	There are often situations when you do lock-based programming, in 
 which you need to temporarily unlock your mutex, perform some time 
 consuming external task (disk i/o, ...) and then reaquire the mutex. 
 For this feature, which is really important also because it is really 
 difficult and dirty to work around it, needs language support, could be 
 something like the inverse of a synchronized {} scope or the 
 possibility to define a special kind of private member function that 
 unlocks the mutex. Then, inside whose blocks the compiler of course has 
 to make sure that the appropriate access rules are not broken (could be 
 as conservative as disallowing access to any class member).
Well, you can work around this by making another shared class wrapping the synchronized class and making things you want to happen in a synchronized block functions of the synchronized class. But that certainly is a lot of trouble. I'd tend to say implicit synchronization is the problem.
 9. unique
 
 	Unique objects or chunks of data are really important not only to be 
 able to check that a cast to 'immutable' is correct, but also to allow 
 for passing objects to another thread for computations without making a 
 superfluous copy or doing superfluous computation.
Indeed, no-aliasing guaranties are important and useful, and not only for multithreading. But unique as a type modifier also introduce other complexities to the language, and I can understand why it was chosen not to add it to D2. I still wish we had it.
 11. holes in the system
 
 	It seems like there are a lot of ways in which you can still slip in 
 non-shared data into a shared context.
Your examples are just small bugs in spawn. They'll eventually get fixed. If you want a real big hole in the type system, look at the destructor problem. <http://d.puremagic.com/issues/show_bug.cgi?id=4621> Some examples of bugs that slip by because of it: <http://d.puremagic.com/issues/show_bug.cgi?id=4624>
 12. more practical examples need to be considered
 
 [...]
 
 	III. multiple threads computing separate parts of an array
If we had a no-aliasing guaranty in the type system (unique), we could make a "splitter" function that splits a unique array at the right positions and returns unique chunks which can be accessed independently by different cores with no race. You could then send each chunk to a different thread with correctness assured. Without this no-aliasing guaranty you can still implement this splitter function, but you're bound to use casts when using it (or suffer the penalty of atomic operations).

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Sep 12 2010
parent reply Sönke Ludwig <ludwig informatik.uni-luebeck.de> writes:
On 12.09.2010 17:00, Michel Fortin wrote:
 9. unique

 Unique objects or chunks of data are really important not only to be
 able to check that a cast to 'immutable' is correct, but also to allow
 for passing objects to another thread for computations without making
 a superfluous copy or doing superfluous computation.
Indeed, no-aliasing guaranties are important and useful, and not only for multithreading. But unique as a type modifier also introduce other complexities to the language, and I can understand why it was chosen not to add it to D2. I still wish we had it.
 11. holes in the system

 It seems like there are a lot of ways in which you can still slip in
 non-shared data into a shared context.
Your examples are just small bugs in spawn. They'll eventually get fixed. If you want a real big hole in the type system, look at the destructor problem. <http://d.puremagic.com/issues/show_bug.cgi?id=4621> Some examples of bugs that slip by because of it: <http://d.puremagic.com/issues/show_bug.cgi?id=4624>
 12. more practical examples need to be considered

 [...]

 III. multiple threads computing separate parts of an array
If we had a no-aliasing guaranty in the type system (unique), we could make a "splitter" function that splits a unique array at the right positions and returns unique chunks which can be accessed independently by different cores with no race. You could then send each chunk to a different thread with correctness assured. Without this no-aliasing guaranty you can still implement this splitter function, but you're bound to use casts when using it (or suffer the penalty of atomic operations).
If the language allows for creating an array, splitting it and processing the chunks in separate threads - and that without any cast in the user part of the code, and with the user code safely checked - I think everything would be fine. Of course a full solution in the language would be ideal, but my worry is more that in general you have to leave the checked part of the type system so often that all that type checking might be completely useless, because only the most simple threading constructs are checked. In that sense a library solution that hides the casts and still guarantees (almost) safe behaviour would already be a huge step forward.

Maybe a UniqueArray(T) type that is library checked and that you can pass through spawn() would be a sufficient solution to at least this problem (a rough sketch below). It could make sure that T is POD and that only operations are allowed that still guarantee uniqueness of the elements.
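What such a UniqueArray(T) could look like as a library type (entirely hypothetical, just to make the idea concrete):
---
// move-only owner of a plain array of PODs
struct UniqueArray(T) if (__traits(isPOD, T)) {
	private T[] data;

	this(size_t length){ data = new T[](length); }

	@disable this(this); // no copies, so 'data' stays the only reference

	ref T opIndex(size_t i){ return data[i]; }
	@property size_t length() const { return data.length; }

	// hand the payload over exactly once (e.g. to cast it to immutable or to
	// move it to another thread); afterwards this instance is empty
	T[] release(){
		auto d = data;
		data = null;
		return d;
	}
}
---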
Sep 14 2010
parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Sönke_Ludwig (ludwig informatik.uni-luebeck.de)'s article
 If the language allows for creating an array, splitting it and
 processing the chunks in separate threads - and that without any cast in
 the user part of the code + the user code is safely checked - I think
 everything would be fine. Of course a full solution in the language
 would be ideal, but my worries are more that in general you have to
 leave the checked part of the type system so often, that all that type
 checking might be completely useless as only the most simple threading
 constructs are checked. In that way a library solution that hides the
 casts and still guarantees (almost) safe behaviour would already be a
 huge step forward.
 Maybe a UniqeArray(T) type that is library checked and that you can pass
 through spawn() would be a sufficient solution to at least this problem.
 It could make sure that T is POD and that only operations are allowed
 that still guarantee uniqueness of the elements.
I thought about making a safe std.parallelism.map(). (There's currently an unsafe one.) It's do-able under some limited circumstances but there are a few roadblocks:

1. The array passed in would have to be immutable, which also would make it very difficult to make map() work on generic ranges.

2. The return value of the mapping function would not be allowed to have unshared aliasing.

3. No user-supplied buffers for writing the result to.

A safe parallel foreach just Ain't Gonna Work (TM) because the whole point of foreach is that it takes a delegate and everything reachable from the stack frame is visible in all worker threads.
Sep 15 2010
prev sibling parent reply "Robert Jacques" <sandford jhu.edu> writes:
On Sun, 12 Sep 2010 09:35:42 -0400, Sönke Ludwig  
<ludwig informatik.uni-luebeck.de> wrote:
 9. unique

 	Unique objects or chunks of data are really important not only to be  
 able to check that a cast to 'immutable' is correct, but also to allow  
 for passing objects to another thread for computations without making a  
 superfluous copy or doing superfluous computation.
Unique (or for those with an Occam background 'mobile') has several proponents in the D community (myself included). It was seriously considered for inclusion in the type system, but Walter found several issues with it on a practical level. If I recall correctly, Walter's exact issues weren't made public, but probably stem from the fact that unique/mobile types in other languages are generally library defined and are 'shallow'. They exist as a 'please use responsibly' / 'here be dragons' feature. For unique to be safe, it needs to be transitive, but this severely limits the objects that can be represented. For example, a doubly-linked-list can not be unique. Unique has been integrated into the Clean and Mercury functional languages (or so says Wikipedia), so there might be reasonable solutions to these problems.
Sep 12 2010
parent =?ISO-8859-15?Q?S=F6nke_Ludwig?= <ludwig informatik.uni-luebeck.de> writes:
On 12.09.2010 17:54, Robert Jacques wrote:
 On Sun, 12 Sep 2010 09:35:42 -0400, Sönke Ludwig
 <ludwig informatik.uni-luebeck.de> wrote:
 9. unique

 Unique objects or chunks of data are really important not only to be
 able to check that a cast to 'immutable' is correct, but also to allow
 for passing objects to another thread for computations without making
 a superfluous copy or doing superfluous computation.
Unique (or for those with an Occam background 'mobile') has several proponents in the D community (myself included). It was seriously considered for inclusion in the type system, but Walter found several issues with it on a practical level. If I recall correctly, Walter's exact issues weren't made public, but probably stem from the fact that unique/mobile types in other languages are generally library defined and are 'shallow'. They exist as a 'please use responsibly' / 'here be dragons' feature. For unique to be safe, it needs to be transitive, but this severely limits the objects that can be represented. For example, a doubly-linked-list can not be unique. Unique has been integrated into the Clean and Mercury functional languages (or so says Wikipedia), so there might be reasonable solutions to these problems.
Unique indeed seems to be a complicated problem when you want to make it flexible for the case of nested objects. However, I think it might already be very useful, even if it works well only with POD types and arrays. It would be interesting to collect different use cases and see what is really needed here so that an overall solution can be created later.
Sep 14 2010