
digitalmars.D - Idea for Threads

reply "Craig Black" <cblack ara.com> writes:
Correct me if I'm wrong, but the synchronized statement can be used to make a 
function, block of code, or variable atomic.  That is, only one thread at a 
time can access it. However, one very important objective of multithreading 
is to make programs faster by using today's multi-core processors.  Using 
synchronized too much would make things run slower, because it could cause a 
lot of thread contention.

I was thinking about this when I got an idea.  It would require another 
keyword that could be used to mark a function or block of code.  Perhaps 
"threaded" or "threadsafe".  This keyword would not force the code to be 
atomic.  Instead, it would cause the compiler to issue errors when the code 
does something that is not thread safe, like writing to unsynchronized 
data.

I don't have a lot of experience with threads, so I don't know all the 
implications here.  I'm not sure if the compiler has enough knowledge to 
prohibit everything that could be the source of threading problems.  But 
even if it could catch the most common bugs, that would be a very good 
thing.

Comments?

-Craig 
May 11 2007
next sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Craig Black schrieb am 2007-05-11:
 Correct me if I'm wrong, but the synchronized statement can be used to make a
 function, block of code, or variable atomic.  That is, only one thread at a
 time can access it. However, one very important objective of multithreading
 is to make programs faster by using today's multi-core processors.  Using
 synchronized too much would make things run slower, because it could cause a
 lot of thread contention.

 I was thinking about this when I got an idea.  It would require another
 keyword that could be used to mark a function or block of code.  Perhaps
 "threaded" or "threadsafe".  This keyword would not force the code to be
 atomic.  Instead, it would cause the compiler to issue errors when the code
 does something that is not thread safe, like writing to unsynchronized
 data.

 I don't have a lot of experience with threads, so I don't know all the
 implications here.  I'm not sure if the compiler has enough knowledge to
 prohibit everything that could be the source of threading problems.  But
 even if it could catch the most common bugs, that would be a very good
 thing.
This is an interesting idea; however, the limitations for "threadsafe" code would be:

* no reference type arguments to the "threadsafe" function
* no synchronized statements
* no use of function pointers / delegates
* no non-final class function calls
* void pointers would require quite advanced compiler support
* due to the current GC implementation:
    no non-scope allocations, no .length changes
* as a consequence of the GC issue:
    no reference type return statement from the "threadsafe" function
* the "threadsafe" function has to be
    1) at module level or
    2) a "static" struct function or
    3) a "final static" class function

Most likely some restrictions are missing, but this should give you an idea. Some of those restrictions only apply to the top-level "threadsafe" function. Depending on the sophistication of the compiler, some limitations for functions called by the top-level one might be lifted.

Thomas
May 12 2007
next sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Thomas Kuehne wrote:
 Craig Black schrieb am 2007-05-11:
 Correct me if I'm wrong, but the synchronized statement can be used to make a
 function, block of code, or variable atomic.  That is, only one thread at a
 time can access it. However, one very important objective of multithreading
 is to make programs faster by using today's multi-core processors.  Using
 synchronized too much would make things run slower, because it could cause a
 lot of thread contention.

 I was thinking about this when I got an idea.  It would require another
 keyword that could be used to mark a function or block of code.  Perhaps
 "threaded" or "threadsafe".  This keyword would not force the code to be
 atomic.  Instead, it would cause the compiler to issue errors when the code
 does something that is not thread safe, like writing to unsynchronized
 data.

 I don't have a lot of experience with threads, so I don't know all the
 implications here.  I'm not sure if the compiler has enough knowledge to
 prohibit everything that could be the source of threading problems.  But
 even if it could catch the most common bugs, that would be a very good
 thing.
 This is an interesting idea; however, the limitations for "threadsafe"
 code would be:

 * no reference type arguments to the "threadsafe" function
 * no synchronized statements
 * no use of function pointers / delegates
 * no non-final class function calls
 * void pointers would require quite advanced compiler support
 * due to the current GC implementation:
     no non-scope allocations, no .length changes
 * as a consequence of the GC issue:
     no reference type return statement from the "threadsafe" function
 * the "threadsafe" function has to be
     1) at module level or
     2) a "static" struct function or
     3) a "final static" class function

 Most likely some restrictions are missing but this should give you an
 idea. Some of those restrictions only apply to the top-level "threadsafe"
 function. Depending on the sophistication of the compiler, some
 limitations for functions called by the top-level one might be lifted.

 Thomas
Personally, I think the future of threading is not in making it easier for programmers to write threaded code, but in making compilers smart enough to automatically thread code.

I mean, from what I've seen, humans have displayed a real knack for being unable to write multi-threaded code in any sane way. I just don't think we're wired up the right way.

That's why I've suggested things in the past like the concept of a "pure" function: one which has no side effects. If a function has no side effects, then the compiler can thread it automatically. List comprehensions and other functional features would help here, too, allowing for loop parallelism.

I tell you what; the person who comes up with a general-purpose C-style language that makes multithreading brain-dead simple will be one seriously rich bugger.

Just my AU$0.02.

	-- Daniel

-- 
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

http://xkcd.com/

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
May 12 2007
next sibling parent janderson <askme me.com> writes:
[snip]
 
 Personally, I think the future of threading is not in making it easier
 for programmers to write threaded code, but to make compilers smart
 enough to automatically thread code.
 
 I mean, from what I've seen, humans have displayed a real knack for
 being unable to write multi-threaded code in any sane way.  I just don't
 think we're wired up the right way.
 
 That's why I've suggested things in the past like the concept of a
 "pure" function--one which has no side effects.  If a function has no
 side-effects, then the compiler can thread it automatically.  List
 comprehensions and other functional features would help here, too,
 allowing for loop parallelism.
 
 I tell you what; the person who comes up with a general-purpose C-style
 language that makes multithreading brain-dead simple will be one
 seriously rich bugger.
 
 Just my AU$0.02.
 
 	-- Daniel
 
Very true. -Joel
May 12 2007
prev sibling next sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Daniel Keep schrieb am 2007-05-12:
 Thomas Kuehne wrote:
 Craig Black schrieb am 2007-05-11:
 Correct me if I'm wrong, but the synchronized statement can be used to make a
 function, block of code, or variable atomic.  That is, only one thread at a
 time can access it. However, one very important objective of multithreading
 is to make programs faster by using today's multi-core processors.  Using
 synchronized too much would make things run slower, because it could cause a
 lot of thread contention.

 I was thinking about this when I got an idea.  It would require another
 keyword that could be used to mark a function or block of code.  Perhaps
 "threaded" or "threadsafe".  This keyword would not force the code to be
 atomic.  Instead, it would cause the compiler to issue errors when the code
 does something that is not thread safe, like writing to unsynchronized
 data.
[...]
 Personally, I think the future of threading is not in making it easier
 for programmers to write threaded code, but to make compilers smart
 enough to automatically thread code.
I think a combination of both approaches will yield the best results. I especially like GCC's -Wunsafe-loop-optimizations and -Wdisabled-optimization. Combining those with cross-module optimization, automatic threading, and a "tell me if you can't auto-thread this function" attribute/pragma should result in a really helpful compiler.
 That's why I've suggested things in the past like the concept of a
 "pure" function--one which has no side effects.  If a function has no
 side-effects, then the compiler can thread it automatically.
# int foo(int* i){
#     *i += 1;
#     return *i;
# }

foo clearly isn't a "pure" function, and thus can't be called by another "pure" function.

# int bar(int i){
#     int j = i * i;
#     return foo(&j);
# }

bar calls an "impure" function and thus would normally not be considered "pure". However, bar is "pure" - there are no side effects <g>

If your definition of "pure" includes bar, it might be of use for the majority of C-style coders. If it doesn't consider bar a "pure" function, a lot of C-style coders will have to re-train to use the features of your smart compiler.

Thomas
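Thomas's foo/bar distinction can be sketched in Python (used here purely for illustration; the helper names are mine): foo mutates its argument, but bar confines that mutation to a local, so bar can be mapped across threads with no locking at all.

```python
from concurrent.futures import ThreadPoolExecutor

def foo(box):
    # Mutates its argument, so foo is not pure on its own.
    box[0] += 1
    return box[0]

def bar(i):
    # The mutable state foo touches is allocated locally and never
    # escapes, so bar has no observable side effects.
    j = [i * i]
    return foo(j)

def parallel_bar(values):
    # Because bar is observably pure, mapping it over a thread pool
    # needs no locking and gives the same answer as a serial loop.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(bar, values))
```

A compiler taking the "includes bar" definition of purity could, in principle, perform this transformation automatically.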
May 12 2007
parent Nicolai Waniek <no.spam thank.you> writes:
Thomas Kuehne wrote:
 If your definition of "pure" includes bar it might be of use for the
 majority of C style coders. If it doesn't consider bar a "pure" function
 a lot of C style coders will have to re-train to use the features of
 your smart compiler.
I don't want to say anything about threading, but I'd say the compiler/language shouldn't be designed for C coders; rather, the C coders should look at the compiler/language spec. If the coder still prefers the C way of doing things, he should definitely stick to C.

best regards,
Nicolai
May 12 2007
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Daniel Keep, el 12 de mayo a las 17:58 me escribiste:
 That's why I've suggested things in the past like the concept of a
 "pure" function--one which has no side effects.
Like Haskell (and other functional languages). Maybe some experience could be collected from it.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
GPG: 5F5A8D05 // F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05

Peperino teaches us that we must offer ourselves with offerings of wine
if we want to obtain the reward of the middle part of the void.
	-- Peperino Pómoro
May 12 2007
prev sibling parent Manfred Nowak <svv1999 hotmail.com> writes:
Daniel Keep wrote

 Personally, I think the future of threading is not in making it
 easier for programmers to write threaded code, but to make
 compilers smart enough to automatically thread code.
This is not possible for compilers in general because:

- algorithms for single or quasi-single processors may lose performance drastically when the number of threads supportable by the hardware increases, i.e. more processors become available.
- in a system that dynamically assigns processors to a process, the compiler would have to know in advance the effects of every possible change in the assigned number of processors.

-manfred
May 12 2007
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
 This is an interesting idea, however the limitations for "threadsafe"
 code would be:

 * no reference type arguments to the "threadsafe" function
What if the references were read only?
 * no synchronized statements
Why not?
 * no use of function pointers / delegates
Because a threadsafe function shouldn't call a non-threadsafe function, right? Perhaps it would be possible to have threadsafe delegates that could only be assigned from threadsafe functions.
 * no non-final class function calls
What do you mean by non-final? Do you mean no virtual function calls? Does this have something to do with non-threadsafe functions being prohibited?
 * void pointers would require quite advanced compiler support
void pointers should probably be avoided.
 * due to the current GC implementation:
     no non-scope allocations, no .length changes
 * as a consequence of the GC issue:
     no reference type return statement from the "threadsafe" function
 * the "threadsafe" function has to be
     1) at module level or
     2) a "static" struct function or
     3) a "final static" class function
Perhaps it could be a local member function if access to its class data members was read only.
 Most likely some restrictions are missing but this should give you an
 idea.
It's a good start.
 Some of those restrictions only apply to the top level "threadsafe"
function.
 Depending on the sophistication of the compiler some limitation for
 functions called by the top level one might be lifted.
Not sure what you mean.

-Craig
May 12 2007
parent Thomas Kuehne <thomas-dloop kuehne.cn> writes:

Craig Black schrieb am 2007-05-13:
 This is an interesting idea, however the limitations for "threadsafe"
 code would be:

 * no reference type arguments to the "threadsafe" function
What if the references were read only?
References that point to immutable data - or an immutable view - are basically just fancy value types and are thus allowed.

"Read only" in the sense of "can be changed by another thread but not by this one" is generally illegal unless it can be ensured that no writing thread executes during the "threadsafe" function's lifetime.
 * no synchronized statements
Why not?
Synchronization via stack objects: causes idle deadlocks once a second synchronize for the same object is encountered - there is only one stack per thread and no "try_synchronized". A single "synchronized" in the context of a "threadsafe" function has no effect.

Synchronization via heap objects: not thread safe - all kinds of deadlocks.
 * no use of function pointers / delegates
Because a threadsafe function shouldn't call a non threadsafe function, right? Perhaps it would be possible to have threadsafe delegates that could only be assigned with threadsafe functions.
For function pointers this is possible, but delegates would also require a "threadsafe" object that is guaranteed not to be used for synchronization (see above).
 * no non-final class function calls
What do you mean by non-final? Do you mean no virtual function calls? Does this have something to do with non-threadsafe functions being prohibited?
A really advanced compiler may allow seemingly virtual function calls. Basically you have to know exactly what - if any - derived classes could be encountered and that none of the potential objects is used in a "synchronized" statement. Basically the compiler turned the virtual call into a non-virtual one with a constrained object argument.
 * due to the current GC implementation:
     no non-scope allocations, no .length changes
 * as a consequence of the GC issue:
     no reference type return statement from the "threadsafe" function
 * the "threadsafe" function has to be
     1) at module level or
     2) a "static" struct function or
     3) a "final static" class function
Perhaps it could be a local member function if access to its class data members was read only.
Again the only-for-objects-without-synchronized-use limitation.
 Some of those restrictions only apply to the top level "threadsafe"
function.
 Depending on the sophistication of the compiler some limitation for
 functions called by the top level one might be lifted.
Not sure what you mean.
# size_t getLen(Object o){
#     return o.toString().length;
# }
#
# class C : Object{
#     char[] toString(){
#         return "123";
#     }
# }
#
# size_t foo(){
#     scope Object o = new C();
#     return getLen(o);
# }

getLen is definitely not threadsafe. However foo - even though it is calling getLen - is threadsafe.

Thomas
May 13 2007
prev sibling next sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Craig Black wrote

 However, one very important objective of multithreading 
 is to make programs faster by using todays multi-core processors.
There is at least one simple test of whether a language is prepared for parallel execution: the computation of the parallel or, por.

por(arg1, ..., argn)
- evaluates all of its arguments simultaneously
- returns true as soon as one of its arguments turns out to be true
- stops all evaluations of its arguments that are still running, as soon as it returns

-manfred
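A cooperative sketch of por in Python (the names and structure here are my own; true preemptive cancellation, as the definition above strictly requires, is not available to Python threads, so each argument is written as a function that receives a stop flag it is expected to poll):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

def por(*thunks):
    """Parallel or: run every thunk concurrently and return True as
    soon as one of them yields True. Each thunk receives a 'stop'
    event and should poll it, since Python threads cannot be killed."""
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=max(len(thunks), 1)) as pool:
        futures = [pool.submit(t, stop) for t in thunks]
        answer = False
        for done in as_completed(futures):
            if done.result():
                answer = True
                break
        # Ask any still-running thunks to wind down before returning.
        stop.set()
        return answer
```

The short-circuit is only as prompt as the slowest poll interval in the remaining thunks, which is exactly the gap between cooperative and true parallel-or semantics.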
May 12 2007
parent 0ffh <spam frankhirsch.net> writes:
Manfred Nowak wrote:
 There is at least one simple test, whether a language is prepared for 
 parallel execution: the computation of the parallel or: por.
I suppose its dual pand should then be just as suited. With pxor, alas, the short-circuiting might get a bit tricky... :)

regards, Frank
May 12 2007
prev sibling next sibling parent reply Martin Persenius <martin persenius.net> writes:
Thomas Kuehne Wrote:
 Most likely some restrictions are missing but this should give you an
 idea.
 
 Thomas
I have been thinking about this a bit and have a couple of ideas to play with. I want to test one with you.

One of the problems is memory corruption because reading is done while a write is in progress. This can be avoided by the use of mutexes. I think this should be automated through the use of protected types, which would give the programmer less to think about. The lock needs to be effective for both reads and writes, but only writes need to lock. Objects already have a mutex upon creation (read it in a post from 2006); this could be used for the protected types as such:

1. Before reading, check if it is locked, then wait or read.
2. Before writing, acquire lock, write, release.

The protected types could be the regular names suffixed with _p (e.g. uint_p). Now, this doesn't solve all issues that could result from the use of pointers, but if you just avoid those, automatic locking would at least simplify the matter. I think this sounds like a pretty straightforward idea, so I'd be happy to hear any outstanding objections. (It would benefit from having structs capable of opAssign, or language integration.)

Example which always locks... not as fast as only checking the lock on read:

private import tango.io.Stdout, tango.util.locks.Mutex;

class Protected(T)
{
    private T item;
    Mutex m;

    this()
    {
        m = new Mutex();
    }

    void opAssign(T v)
    {
        m.acquire();
        scope(exit) m.release();
        item = v;
    }

    T opCall()
    {
        m.acquire();
        scope(exit) m.release();
        return item;
    }
}

int main()
{
    auto x = new Protected!(int);
    x = 5;
    Stdout.formatln("x.item = {0}", x());
    return 0;
}
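For comparison, a minimal Python sketch of the same automatic-locking idea (class and method names here are mine, not Martin's): each individual read or write takes a mutex, though compound read-modify-write sequences still need to hold the lock across the whole sequence.

```python
import threading

class Protected:
    """A value whose individual reads and writes each take a mutex,
    mirroring the Protected!(T) idea above. Single accesses become
    atomic; sequences like 'read, modify, write' still need their
    own locked operation to be race-free."""
    def __init__(self, value=None):
        self._lock = threading.Lock()
        self._value = value

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value

    def update(self, fn):
        # Holds the lock across a read-modify-write, which a bare
        # get() followed by set() cannot guarantee.
        with self._lock:
            self._value = fn(self._value)
            return self._value
```

The update method is the interesting part: without it, two threads doing get-then-set can still lose increments, which is one of the "outstanding objections" to purely per-access locking.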
May 13 2007
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Martin Persenius wrote:
 1. Before reading, check if it is locked, then wait or read.
 2. Before writing, acquire lock, write, release.
That's not safe; there's a race condition:

Thread A: Check if locked, begin reading.
 thread switch <<<
Thread B: Acquire lock, write, release.
 thread switch <<<
Thread A: Continue reading.

Thread A will still have the stuff being read changing out from under it... The writing procedure needs to be modified to check if anyone's currently reading and, if so, wait until they're done. This means the readers also need to mark something while they're busy. It'd have to be some kind of counter, since multiple simultaneous readers are allowed.

If you want new readers to wait until a waiting writer has done its thing, the readers also need to actually lock something, though perhaps only at the beginning and end, not while they're working.

Some googling reveals that pthreads has pthread_rwlock* to implement this[1].

[1]: See http://www.die.net/doc/linux/man/man3/pthread_rwlock_init.3.html and related pages.
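The counter-plus-waiting scheme described above can be sketched as a toy reader-writer lock in Python (names are mine; unlike pthread_rwlock, this version has no writer preference, so a steady stream of readers can starve writers):

```python
import threading

class RWLock:
    """Minimal reader-writer lock: many readers or one writer.
    Writers wait until the reader count drains to zero; new readers
    wait while a writer holds the lock."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                # Wake any writer waiting for the last reader.
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

This is exactly the structure Frits describes: readers bump a counter, writers wait for it to reach zero, and the condition variable is the "something" both sides briefly lock.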
May 13 2007
parent reply Martin Persenius <martin persenius.net> writes:
Frits,

You (I) learn something new everyday - thanks!

What do you think about types with automatic locking then? Not specifically my
fouled up attempt.

I suppose I need to become more familiar with the exact details of the problems
to be overcome in threading.

Martin
May 13 2007
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Martin Persenius wrote:
 Frits,
 
 You (I) learn something new everyday - thanks!
You're welcome.
 What do you think about types with automatic locking then? Not specifically my
fouled up attempt.
It's a nice idea, but probably not easy to implement in a way that's intuitively "right". If the type has multiple fields, for example, you might want to hold the lock over multiple accesses. I'm not sure that can be done nicely in the current language.

If "smart references" (akin to C++ "smart pointers") were possible, that could be a good way to implement it. But overloading "." is currently not possible...
 I suppose I need to become more familiar with the exact details of the
problems to be overcome in threading.
I'm not terribly familiar with them either. I just noticed a race condition :).
May 13 2007
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
 One of the problems is memory corruption because reading is done when a
 write is in progress. This can be avoided by the use of mutexes.
I was thinking about this. An efficient mutex implementation should take into consideration read and write access. If there is a write, then all access to the data should be prohibited until the write is completed. However, any number of reads should be able to work together in parallel. If a read is treated the same as a write, then that would be very inefficient.

I don't know a lot about the details of mutexes. Do they allow multiple reads simultaneously?

-Craig
May 14 2007
parent reply Sean Kelly <sean f4.ca> writes:
Craig Black wrote:
 One of the problems is memory corruption because reading is done when a
 write is in progress. This can be avoided by the use of mutexes.
 I was thinking about this. An efficient mutex implementation should take into consideration read and write access. If there is a write, then all access to the data should be prohibited until the write is completed. However, any number of reads should be able to work together in parallel. If a read is treated the same as a write, then that would be very inefficient. I don't know a lot about the details of mutexes. Do they allow multiple reads simultaneously?
ReadWrite mutexes do, but they're a bit more complicated than your average mutex.

Sean
May 14 2007
parent Downs <default_357-line yahoo.de> writes:
Sean Kelly wrote:
 Craig Black wrote:
 I don't know a lot about the details of mutexes.  Do they allow multiple
 reads simultaneously?
ReadWrite mutexes do, but they're a bit more complicated than your average mutex.
About like so?

class extlock {
    Thread writing = null;
    int reading = 0;
    int wfwl = 0; /// waiting for write lock

    private bool lock(bool exclusive)() {
        synchronized(this) {
            static if (exclusive) {
                if (writing || (reading > 0)) return false;
                else { writing = Thread.getThis; return true; }
            } else {
                if (writing || wfwl) return false;
                else { reading++; return true; }
            }
        }
    }

    void write_lock() {
        synchronized(this) wfwl++;
        while (!lock!(true)) rest;
        synchronized(this) wfwl--;
    }

    void read_lock() {
        while (!lock!(false)) rest;
    }

    // the asserts are unsynced because they're not part of the normal flow
    // and don't strictly need to be threadsafe.
    void write_unlock() {
        assert(writing == Thread.getThis);
        synchronized(this) writing = null;
    }

    void read_unlock() {
        assert(!writing);
        assert(reading > 0);
        synchronized(this) reading--;
    }
}

scope class readlock {
    extlock s;
    this(typeof(s) s) { this.s = s; s.read_lock; }
    ~this() { s.read_unlock; }
}

scope class writelock {
    extlock s;
    this(typeof(s) s) { this.s = s; s.write_lock; }
    ~this() { s.write_unlock; }
}

unittest {
    auto sync = new extlock;
    assert(sync);
    sync.read_lock;
    sync.read_unlock;
    {
        scope wl = new writelock(sync);
    }
}
May 14 2007
prev sibling parent Manfred Nowak <svv1999 hotmail.com> writes:
Craig Black wrote

 However, one very important objective of multithreading 
 is to make programs faster by using todays multi-core processors. 
Tomorrow at 9:00 AM PDT there is a free "webinar" from Intel: "Three Steps to Threading and Performance Part 2 - Expressing Parallelism: Case Studies with Intel Threading Building Blocks"

-manfred
May 14 2007