digitalmars.D - Proposals: Synchronization

"Kent Boogaart" <kentcb internode.on.net> writes:

Hi all,

I have some ideas regarding D's synchronization support that I'd like to put 
forward. Any input would be much appreciated.

1. Change the "synchronized" keyword to "lock".

Pros:
   - easier and quicker to type
   - no confusion over spelling (it is spelt "synchronised" in Australia, 
UK et al)

Cons:
   - none that I can think of

As a Java programmer, it used to annoy me typing "synchronized" every 

terser "lock" keyword.

2. Don't permit arbitrary locking of objects.

It is well accepted amongst the Java and .NET communities that allowing 
locking of arbitrary objects is a bad thing. For example, this is considered 
bad practice:

public void myMethod()
{
    ...
    lock (this)
    {
        ...
    }
}

It is bad because any other code could lock the object referred to by this. 
That can result in deadlocks, race conditions and all sorts of weird 
behavior. The accepted practice is:

private object _lock = new object();

public void myMethod()
{
     ...
     lock (_lock)
     {
         ...
     }
}

That way only the owning class can lock on the object.
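
A minimal Java sketch of the private-lock idiom described above. The class and method names (`PrivateLockCounter`, `runDemo`, `outsiderCannotBlock`) are made up for illustration. It assumes the only goal is to keep the monitor out of reach of outside code: the second helper holds the counter's own monitor from outside and checks that `increment()` on another thread still completes.

```java
final class PrivateLockCounter {
    // Private monitor: code outside this class has no reference to it.
    private final Object lock = new Object();
    private int count;

    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts `threads` workers that each call increment() `perThread` times.
    static int runDemo(int threads, int perThread) {
        PrivateLockCounter counter = new PrivateLockCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            joinQuietly(worker, 0);
        }
        return counter.get();
    }

    // An outsider holds the counter's own monitor while another thread
    // increments. The increment still completes because it never locks `this`.
    static boolean outsiderCannotBlock() {
        PrivateLockCounter counter = new PrivateLockCounter();
        Thread worker = new Thread(counter::increment);
        synchronized (counter) {
            worker.start();
            joinQuietly(worker, 2000);
        }
        return counter.get() == 1;
    }

    private static void joinQuietly(Thread t, long millis) {
        try {
            t.join(millis); // 0 means wait without a timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));
        System.out.println(outsiderCannotBlock());
    }
}
```

Had `increment()` used `synchronized (this)` instead, the worker in `outsiderCannotBlock` would stall until the outer block released the counter's monitor.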

So my suggestion is to disallow locking on arbitrary objects and change the 
lock keyword to only allow locking on a Phobos-provided Lock class like 
this:

private Lock _lock = new Lock(); //the Lock class or struct is implemented in Phobos

public void myMethod()
{
    lock (_lock)  //OK
    {
    }

    lock (this) {} //compile error

    lock (new object()) {} //compile error
}
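
For comparison, Java's standard library already has a dedicated lock type that is close in spirit to the proposed Phobos `Lock`. The sketch below is an analogy only: `ExplicitLockAccount` is a hypothetical class, and unlike the proposal, Java cannot turn `synchronized (this)` into a compile error, so the discipline is by convention.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class ExplicitLockAccount {
    // Plays the role of the proposed library-provided Lock object.
    private final Lock lock = new ReentrantLock();
    private long balance;

    void deposit(long amount) {
        lock.lock();
        try {
            balance += amount;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ExplicitLockAccount account = new ExplicitLockAccount();
        account.deposit(5);
        account.deposit(7);
        System.out.println(account.balance());
    }
}
```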

I would also suggest NOT allowing this syntax:

public lock(_lock) void myMethod()
{
}

Because it is ugly and synchronization is an implementation detail that 
should be kept out of the method signature.

Pros:
   - the synch block can be removed from every object stored on the gc heap 
(that's a saving of at least 4 bytes per gc object - that's huge for 
applications that allocate a lot of small objects)
   - programmers are forced to lock in a safer manner. The problems of Java 
/ .NET locking on arbitrary objects are avoided.
   - D will be able to provide better diagnostics of locks as a program runs 
and during debug sessions. Locks could be named, for example

Cons:
   - none that I can think of
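
The "locks could be named" point can be sketched in Java by subclassing `ReentrantLock`. `NamedLock` and `describe` are hypothetical names; `getOwner()` is a real protected method of `ReentrantLock` that the Java documentation describes as a best-effort aid for monitoring.

```java
import java.util.concurrent.locks.ReentrantLock;

final class NamedLock extends ReentrantLock {
    private static final long serialVersionUID = 1L;
    private final String name;

    NamedLock(String name) {
        this.name = name;
    }

    // Human-readable state for logs or a debugger watch window.
    String describe() {
        Thread owner = getOwner(); // null when no thread holds the lock
        return owner == null
                ? name + " [unlocked]"
                : name + " [held by " + owner.getName() + "]";
    }

    public static void main(String[] args) {
        NamedLock cacheLock = new NamedLock("cache");
        System.out.println(cacheLock.describe());
        cacheLock.lock();
        try {
            System.out.println(cacheLock.describe());
        } finally {
            cacheLock.unlock();
        }
    }
}
```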

Love to hear what others think,
Kent Boogaart

Jul 22 2006

"Chris Miller" <chris dprogramming.com> writes:

I agree with everything in your proposal!


works because a string is just another object that can be locked on, and  
all string literals with the same text end up becoming one string object.  
This is very handy. Perhaps D could allow locking/synchronizing on string  
literals by having the compiler create a static Lock instance for each  
unique string literal used for synchronization.
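
The example this post referred to did not survive in the archive. As an illustration only, here is how the same effect looks in Java, where string literals with equal text are interned to a single object and therefore share a single monitor. `StringLiteralLockDemo` and its methods are hypothetical names.

```java
final class StringLiteralLockDemo {
    // Two literals with the same text are the same object, so locking one
    // means holding the monitor of the other.
    static boolean literalsShareMonitor() {
        String a = "config";
        String b = "config";
        synchronized (a) {
            return a == b && Thread.holdsLock(b);
        }
    }

    // A string built at run time is a different object with its own monitor.
    static boolean freshStringHasOwnMonitor() {
        String literal = "config";
        String fresh = new String("config");
        synchronized (literal) {
            return !Thread.holdsLock(fresh);
        }
    }

    public static void main(String[] args) {
        System.out.println(literalsShareMonitor());
        System.out.println(freshStringHasOwnMonitor());
    }
}
```

The sharing cuts both ways: unrelated code that happens to synchronize on the same literal text contends for the same lock, which is the arbitrary-object problem raised in the original proposal.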

Jul 22 2006

"Kent Boogaart" <kentcb internode.on.net> writes:

Are there no further opinions on this? Am I too late in bringing up stuff 
like this or am I perhaps asking in the wrong place?

Thanks,
Kent

Jul 24 2006

Carlos Santander <csantander619 gmail.com> writes:

Kent Boogaart wrote:
 Are there no further opinions on this? Am I too late in bringing up stuff 
 like this or am I perhaps asking in the wrong place?
 
 Thanks,
 Kent
 
 

First one, methinks. At least for me, it is. I like the idea, but I don't think 
it'll get in.

-- 
Carlos Santander Bernal

Jul 24 2006

Sean Kelly <sean f4.ca> writes:

Kent Boogaart wrote:
 Hi all,
 
 I have some ideas regarding D's synchronization support that I'd like to put 
 forward. Any input would be much appreciated.
 
 1. Change the "synchronized" keyword to "lock".
 
 Pros:
    - easier and quicker to type
    - no confusion over spelling (it is spelt "synchronised" in Australia, 
 UK et al)
 
 Cons:
    - none that I can think of
 
 As a Java programmer, it used to annoy me typing "synchronized" every 

 terser "lock" keyword.

Personally, I think 'synchronized' is slightly more meaningful, but it's 
water under the bridge at this point.  I don't expect any keywords to 
change before 1.0.

 2. Don't permit arbitrary locking of objects.
 
 It is well accepted amongst the Java and .NET communities that allowing 
 locking of arbitrary objects is a bad thing. For example, this is considered 
 bad practice:
 
 public void myMethod()
 {
     ...
     lock (this)
     {
         ...
     }
 }
 
 It is bad because any other code could lock the object referred to by this. 
 That can result in deadlocks, race conditions and all sorts of weird 
 behavior. The accepted practice is:
 
 private object _lock = new object();
 
 public void myMethod()
 {
      ...
      lock (_lock)
      {
          ...
      }
 }
 
 That way only the owning class can lock on the object.
 
 So my suggestion is to disallow locking on arbitrary objects and change the 
 lock keyword to only allow locking on a Phobos-provided Lock class like 
 this:
 
 private Lock _lock = new Lock(); //the Lock class or struct is implemented in Phobos
 
 public void myMethod()
 {
     lock (_lock)  //OK
     {
     }
 
      lock (this) {} //compile error
 
      lock (new object()) {} //compile error
 }
 
 I would also suggest NOT allowing this syntax:
 
 public lock(_lock) void myMethod()
 {
 }
 
 Because it is ugly and synchronization is an implementation detail that 
 should be kept out of the method signature.
 
 Pros:
    - the synch block can be removed from every object stored on the gc heap 
 (that's a saving of at least 4 bytes per gc object - that's huge for 
 applications that allocate a lot of small objects)
    - programmers are forced to lock in a safer manner. The problems of Java 
 / .NET locking on arbitrary objects are avoided.
    - D will be able to provide better diagnostics of locks as a program runs 
 and during debug sessions. Locks could be named, for example
 
 Cons:
    - none that I can think of

This would make this currently legal syntax illegal:

void fn() {
     synchronized {
         // stuff
     }
}

ie. where the synchronization object is implicit.  I suppose its value 
is debatable, but I think it's sufficiently useful that I wouldn't want 
it to be illegal.


Sean

Jul 24 2006

"Kent Boogaart" <kentcb internode.on.net> writes:

 This would make this currently legal syntax illegal:

 void fn() {
     synchronized {
         // stuff
     }
 }

 ie. where the synchronization object is implicit.  I suppose its value is 
 debatable, but I think it's sufficiently useful that I wouldn't want it to 
 be illegal.

I see that as a good thing, though. That is because the implicit syntax is 
locking on this (I assume), which - as discussed - is a bad thing.

Regards,
Kent Boogaart

Jul 24 2006

Sean Kelly <sean f4.ca> writes:

Kent Boogaart wrote:
 This would make this currently legal syntax illegal:

 void fn() {
     synchronized {
         // stuff
     }
 }

 ie. where the synchronization object is implicit.  I suppose its value is 
 debatable, but I think it's sufficiently useful that I wouldn't want it to 
 be illegal.

 
 I see that as a good thing, though. That is because the implicit syntax is 
 locking on this (I assume), which - as discussed - is a bad thing.

It will lock on the instance monitor (ie. 'this'), the static object 
monitor (ie. Classname.classinfo), or the global monitor, depending on 
context.
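
Java has a comparable rule for its implicit monitors, which may help readers coming from that side of the discussion. This is a Java analogy, not a statement about D's implementation: a synchronized instance method locks `this`, and a static synchronized method locks the class object. Java has no counterpart to the global monitor mentioned above. `ImplicitMonitorDemo` is a hypothetical class.

```java
final class ImplicitMonitorDemo {
    // Instance method: the implicit monitor is `this`.
    synchronized boolean instanceHoldsThis() {
        return Thread.holdsLock(this);
    }

    // Static method: the implicit monitor is the Class object.
    static synchronized boolean staticHoldsClass() {
        return Thread.holdsLock(ImplicitMonitorDemo.class);
    }

    public static void main(String[] args) {
        System.out.println(new ImplicitMonitorDemo().instanceHoldsThis());
        System.out.println(staticHoldsClass());
    }
}
```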

By the way, I agree that it's a Bad Idea to lock on an object that 
internally locks on 'this', but I'm not certain that it's something 
worth having the compiler enforce.


Sean

Jul 24 2006

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 25 Jul 2006 07:19:27 +0930, Kent Boogaart  
<kentcb internode.on.net> wrote:
 This would make this currently legal syntax illegal:

 void fn() {
     synchronized {
         // stuff
     }
 }

 ie. where the synchronization object is implicit.  I suppose its value is
 debatable, but I think it's sufficiently useful that I wouldn't want it to
 be illegal.

 I see that as a good thing, though. That is because the implicit syntax is
 locking on this (I assume), which - as discussed - is a bad thing.

I'm not sure I agree that this "is a bad thing". I have done lots of  
multithreaded programming in C where you lock a separate object (a mutex)  
rather than the thing to which you want to synchronize access. I have also  
done some (not a huge amount of) Java programming where I locked the object  
itself. I can't say that I got fewer deadlocks in C than I did in Java.

I very much prefer not having to declare, create and use a separate object  
when I can just lock the thing I want to synchronize.

In my experience deadlocks occur whenever 2 or more pieces of code lock 2  
or more objects and where each piece of code does it in a different order.  
eg.

[thread1]
lock A
..thread swap occurs..
lock B

[thread2]
lock B
..thread swap occurs..
lock A

If thread1 locks A then the CPU swaps threads and thread2 locks B .. you  
have a deadlock. Neither thread can progress. If, however, you always lock  
in a specific order i.e. A then B always, you can't get a deadlock (in my  
experience).
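
The "always lock in a specific order" rule can be sketched in Java as follows. The `Account` type, its `id` field used as the ordering key, and `runDemo` are assumptions made for the example. Two threads transfer in opposite directions, which is the A/B versus B/A pattern above, yet both take the locks in the same order.

```java
import java.util.concurrent.locks.ReentrantLock;

final class OrderedTransfer {
    static final class Account {
        final int id; // fixed rank that decides lock order, assumed unique
        final ReentrantLock lock = new ReentrantLock();
        long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always locks the account with the smaller id first, whatever the
    // direction of the transfer.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Two threads transfer in opposite directions; returns the total balance.
    static long runDemo(int rounds) {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.balance + b.balance;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000));
    }
}
```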

So, rather than separating the object being locked (the mutex/lock class)  
from the object being synchronized, which just seems like bad design to  
me, and IMO doesn't actually do anything to solve the problem, what about  
some idea to ensure the order in which things are locked?

Off the top of my head.. (and this is something a threading/locking  
library could do..)

1. Give each object being locked a priority:
object1: 0
object2: 1
object3: 2
object4: 3

2. If a lock is required on an object and the thread already has a lock on  
an object with a higher priority, release the lock on the higher priority  
object, obtain the new lock, and then re-obtain the higher priority lock.

Example:

If thread A locks object2, then locks object1, we release the lock on  
object2 before obtaining the lock on object1, then we re-obtain the lock  
on object2. So, using our previous example:

[priority]
lockA: 0
lockB: 1

[thread1]
lock A
..thread swap occurs..
lock B

[thread2]
lock B
..thread swap occurs..
lock A

thread1 obtains lock A {thread swap} thread2 obtains lock B, thread2  
requests lock A, we release lock B (higher priority than A), wait for A  
(which is locked by thread1 already) {thread swap}, thread1 obtains lock  
B, continues.. releases locks.. {thread swap} thread2 obtains lock A,  
thread2 asks for and obtains lock B..

and like magic.. no more deadlocks ;)
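
A rough Java sketch of the release-and-reacquire scheme described in this post, under stated assumptions: priorities are unique, a thread never acquires the same lock twice, and all names (`RankedLock`, `RankedLocking`, `acquire`, `release`, `runDemo`) are invented for the example. Each thread tracks the locks it holds, and before taking a lower-priority lock it gives up any higher-priority ones and takes them back afterwards in ascending order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

final class RankedLock {
    final int priority; // assumed unique per lock
    final ReentrantLock lock = new ReentrantLock();

    RankedLock(int priority) {
        this.priority = priority;
    }
}

final class RankedLocking {
    // Locks held by the current thread, keyed and sorted by priority.
    private static final ThreadLocal<TreeMap<Integer, RankedLock>> HELD =
            ThreadLocal.withInitial(TreeMap::new);

    static void acquire(RankedLock wanted) {
        TreeMap<Integer, RankedLock> held = HELD.get();
        // Held locks ranked strictly above the wanted one, in ascending order.
        List<RankedLock> higher =
                new ArrayList<>(held.tailMap(wanted.priority, false).values());
        for (RankedLock h : higher) h.lock.unlock(); // step back
        wanted.lock.lock();
        held.put(wanted.priority, wanted);
        for (RankedLock h : higher) h.lock.lock();   // re-obtain, lowest first
    }

    static void release(RankedLock lock) {
        HELD.get().remove(lock.priority);
        lock.lock.unlock();
    }

    // thread1 takes A then B, thread2 takes B then A; true if both finish.
    static boolean runDemo(int rounds) {
        RankedLock a = new RankedLock(0);
        RankedLock b = new RankedLock(1);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                acquire(a); acquire(b); release(b); release(a);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                acquire(b); acquire(a); release(a); release(b);
            }
        });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(10_000);
            t2.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000));
    }
}
```

One design consequence worth noting: while a higher-priority lock is temporarily released, other threads can change the data it protects, so code using such a scheme would have to re-check its assumptions after `acquire` returns.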

Regan

Jul 24 2006

"Kent Boogaart" <kentcb internode.on.net> writes:

I feel the need for a more concrete example. Suppose you write this class 
(pseudo code):

public class MyClass
{
    //data protected by lock(this)
    private object _data1;
    private object _data2;
    //data protected by lock(_list);
    private object _data3;
    private object _data4;
    private List _list;

    ...

    public void someMethod()
    {
        lock (this)
        lock (_list)
        {
            //do something with data1 through data4
            _list.someListMethod();
        }
    }

    public void someOtherMethod()
    {
        _list.someListMethod();
    }
}

Seems pretty innocent, right? Now suppose you didn't write the List class. 
Someone else did or maybe it's part of Phobos or the .NET FCL or the Java 
class libraries. And suppose it looks a bit like this:

public class List
{
    private object _owner;

    public List(object owner)
    {
        _owner = owner;
    }

    public void someListMethod()
    {
        lock (this)
        lock (_owner)
        {
            //do something
        }
    }
}

See the problem? Take 2 threads: A and B. Thread A enters 
MyClass.someMethod() and takes a lock on this (ie. the instance of MyClass). 
Then it is preempted and thread B enters someOtherMethod(). It goes into 
List.someListMethod() and takes a lock on this (ie. the instance of List). 
It then attempts to lock on _owner (ie. the instance of MyClass). It can't 
because thread A has that lock. Thread A can't continue because thread B has 
the lock on the list instance. Bang - deadlock.
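
The interleaving described above can be reproduced deterministically in Java. This is a simplified sketch rather than a port of the pseudo code: two plain objects stand in for the `MyClass` and `List` instances, latches force the "preempted" ordering, and the JVM's own monitor-deadlock detection (`ThreadMXBean.findMonitorDeadlockedThreads`) confirms the cycle. The threads are daemons so the process can still exit. `OwnerListDeadlock` and `reproduce` are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

final class OwnerListDeadlock {
    static boolean reproduce() {
        final Object owner = new Object(); // stands in for the MyClass instance
        final Object list = new Object();  // stands in for the List instance
        final CountDownLatch ownerHeld = new CountDownLatch(1);
        final CountDownLatch listHeld = new CountDownLatch(1);

        Thread a = new Thread(() -> {
            synchronized (owner) {          // someMethod(): lock (this)
                ownerHeld.countDown();
                awaitQuietly(listHeld);     // "preempted" until B holds the list
                synchronized (list) { }     // lock (_list): never acquired
            }
        }, "thread-A");
        Thread b = new Thread(() -> {
            awaitQuietly(ownerHeld);
            synchronized (list) {           // someListMethod(): lock (this)
                listHeld.countDown();
                synchronized (owner) { }    // lock (_owner): never acquired
            }
        }, "thread-B");
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();

        // Poll the JVM for a monitor deadlock for up to about five seconds.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null if none
            if (ids != null && ids.length >= 2) {
                return true;
            }
            sleepQuietly(50);
        }
        return false;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```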

This example may seem a little contrived but this kind of thing has happened 
and continues to happen in the real world. Why? Because any object instance 
is lockable and *developers have little to no way of knowing what locks are 
being taken by the code they're calling and in what order*. Because any 
object can be locked, the following is true:
1. Developers will often just lock on this (or the TypeInfo when in static 
scope) because it is "easier" and "cleaner". This ensures the problems 
discussed above can occur. If developers had to explicitly create an object 
to lock on, they would likely think harder about whether to expose that 
object for others to lock. Most (all?) of the time they would realize that 
there is no point doing that.
2. Malicious code can deliberately lock objects so that the host is 
deadlocked if it also attempts to lock on those objects (for example, 
consider a .NET assembly loaded into SQL Server 2005's CLR or a Java class 
loaded into Eclipse)
3. As I already pointed out, every object carries around at least 4 extra 
bytes on the heap - "just in case" it is ever locked. To me that is 
enormously wasteful because most objects are never locked and I never lock 
on public objects for the reasons above. I'd much rather be in direct 
control of the memory being used to support locking.

Sorry to bang on about this. What can I say? I'm passionate. I daresay that 
if you asked the designers of the .NET and Java threading support they would 
express regret at allowing locks on arbitrary objects. Heck, I'd be happy to 
do that for you if you like . . . ?

Regards,
Kent Boogaart

Jul 25 2006

"Regan Heath" <regan netwin.co.nz> writes:

On Tue, 25 Jul 2006 21:59:36 +0930, Kent Boogaart  
<kentcb internode.on.net> wrote:
 I feel the need for a more concrete example. Suppose you write this class
 (pseudo code):

 public class MyClass
 {
     //data protected by lock(this)
     private object _data1;
     private object _data2;
     //data protected by lock(_list);
     private object _data3;
     private object _data4;
     private List _list;

     ...

     public void someMethod()
     {
         lock (this)
         lock (_list)
         {
             //do something with data1 through data4
             _list.someListMethod();
         }
     }

     public void someOtherMethod()
     {
         _list.someListMethod();
     }
 }

 Seems pretty innocent, right? Now suppose you didn't write the List class.
 Someone else did or maybe it's part of Phobos or the .NET FCL or the Java
 class libraries. And suppose it looks a bit like this:

 public class List
 {
     private object _owner;

     public List(object owner)
     {
         _owner = owner;
     }

     public void someListMethod()
     {
         lock (this)
         lock (_owner)
         {
             //do something
         }
     }
 }

 See the problem? Take 2 threads: A and B. Thread A enters
 MyClass.someMethod() and takes a lock on this (ie. the instance of MyClass).
 Then it is preempted and thread B enters someOtherMethod(). It goes into
 List.someListMethod() and takes a lock on this (ie. the instance of List).
 It then attempts to lock on _owner (ie. the instance of MyClass). It can't
 because thread A has that lock. Thread A can't continue because thread B has
 the lock on the list instance. Bang - deadlock.

But not if you use my idea! ;)

If you remove object locking and re-write the code above, you'll still get  
a deadlock. Why? Because 'owner' or whatever you can actually lock will  
still be shared and will still be locked 2nd in one place and 1st in  
another and that is what causes the deadlock.

 This example may seem a little contrived but this kind of thing has happened
 and continues to happen in the real world. Why?

Because people don't understand multithreading (because it's hard) and  
because people write bad code.

 Because any object instance
 is lockable and *developers have little to no way of knowing what locks
 are being taken by the code they're calling and in what order*. Because
 any object can be locked, the following is true:
 1. Developers will often just lock on this (or the TypeInfo when in
 static scope) because it is "easier" and "cleaner". This ensures the
 problems discussed above can occur.

I disagree that it "ensures" the problem can occur. I believe the *order*  
in which the locks are taken causes the problem, not what can or cannot be  
locked.

One thing removing object locking will do, IMO, is decrease the number of  
things you have to consider _may_ be locked and causing a particular  
deadlock which you're hunting for. In other words it will also make it  
more obvious where locking does or can occur.

 If developers had to explicitly create an object
 to lock on, they will likely think harder about whether they expose that
 object for others to lock.

I agree that having to create an object and then explicitly share it will  
make it more obvious where a deadlock could occur (but only to a half  
competent programmer). It won't prevent one from happening, however, because  
as soon as the "object for others to lock" is shared you introduce the  
possibility that the same locks will be taken in a different order in 2  
different pieces of code.

 Most (all?) of the time they will realize that
 there is no point doing that.

Why? Either they need to lock it, or they don't. If they do, they have to  
share it, if they share it .. a deadlock can occur.

 2. Malicious code can deliberately lock objects so that the host is
 deadlocked if it also attempts to lock on those objects (for example,
 consider a .NET assembly loaded into SQL Server 2005's CLR or a Java  
 class loaded into Eclipse)

There is no cure for stupidity, or malice.. except perhaps a bigger gun. ;)

 3. As I already pointed out, every object carries around at least 4 extra
 bytes on the heap - "just in case" it is ever locked. To me that is
 enormously wasteful because most objects are never locked and I never
 lock on public objects for the reasons above. I'd much rather be in
 direct control of the memory being used to support locking.

I dislike waste too but you don't get something for nothing. That said, if  
we did remove those 4 bytes perhaps we could use template bolt-ons to make  
existing classes lockable, something like..

// class template that derives from the type it wraps
class Lockable(Base) : Base {
   ..code to lock/unlock etc..
}

auto f = new Lockable!(BufferedFile)();
..etc..

this solution would also make it more obvious things were being locked.
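
A loose Java counterpart of the bolt-on idea, for illustration. Java generics cannot inherit from a type parameter, so this sketch swaps inheritance for composition: a hypothetical `Lockable<T>` wrapper owns the value and a private lock, and every access goes through `withLock`.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

final class Lockable<T> {
    private final T value;
    private final ReentrantLock lock = new ReentrantLock();

    Lockable(T value) {
        this.value = value;
    }

    // Runs `action` on the wrapped value while holding the lock.
    <R> R withLock(Function<? super T, ? extends R> action) {
        lock.lock();
        try {
            return action.apply(value);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Lockable<StringBuilder> log = new Lockable<>(new StringBuilder());
        log.withLock(sb -> sb.append("ab"));
        System.out.println(log.withLock(sb -> sb.toString()));
    }
}
```

Only objects wrapped this way pay for a lock, which matches the memory argument made earlier in the thread.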

 Sorry to bang on about this. What can I say? I'm passionate.

Nothing wrong with that.

 I daresay that if you asked the designers of the .NET and Java threading  
 support they will express regret at allowing locks on arbitrary objects.  
 Heck, I'd be happy to do that for you if you like . . . ?

Sure, link me to their reply :)

In fact, link me to any discussions in .NET or Java on this topic that you  
know about. I'm interested in this also.

Regan

Jul 25 2006

Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:

Sean Kelly wrote:
 Kent Boogaart wrote:
 2. Don't permit arbitrary locking of objects.

 It is well accepted amongst the Java and .NET communities that 
 allowing locking of arbitrary objects is a bad thing. For example, 
 this is considered bad practice:

 public void myMethod()
 {
     ...
     lock (this)
     {
         ...
     }
 }

 It is bad because any other code could lock the object referred to by 
 this. That can result in deadlocks, race conditions and all sorts of 
 weird behavior. The accepted practice is:

 private object _lock = new object();

 public void myMethod()
 {
      ...
      lock (_lock)
      {
          ...
      }
 }

 That way only the owning class can lock on the object.

 So my suggestion is to disallow locking on arbitrary objects and 
 change the lock keyword to only allow locking on a Phobos-provided 
 Lock class like this:

 private Lock _lock = new Lock(); //the Lock class or struct is implemented in Phobos

 public void myMethod()
 {
     lock (_lock)  //OK
     {
     }

      lock (this) {} //compile error

      lock (new object()) {} //compile error
 }

 I would also suggest NOT allowing this syntax:

 public lock(_lock) void myMethod()
 {
 }

 Because it is ugly and synchronization is an implementation detail 
 that should be kept out of the method signature.

 Pros:
    - the synch block can be removed from every object stored on the gc 
 heap (that's a saving of at least 4 bytes per gc object - that's huge 
 for applications that allocate a lot of small objects)
    - programmers are forced to lock in a safer manner. The problems of 
 Java / .NET locking on arbitrary objects are avoided.
    - D will be able to provide better diagnostics of locks as a 
 program runs and during debug sessions. Locks could be named, for example

 Cons:
    - none that I can think of

 
 This would make this currently legal syntax illegal:
 
 void fn() {
     synchronized {
         // stuff
     }
 }
 
 ie. where the synchronization object is implicit.  I suppose its value 
 is debatable, but I think it's sufficiently useful that I wouldn't want 
 it to be illegal.
 
 
 Sean

Hum, I'm attracted to the idea that D's Objects would not have to have 
the monitor data (although I can't clearly say that the performance gain 
will be significant).

I understand the "synchronized { ... }" statement might be useful, so 
perhaps we can have the best of both worlds? What if the implicit 
synchronized statement called, similarly to operator overloading, a 
predefined lock function (or monitor member)? Then we could have a mixin 
that would make a class lockable, by adding a monitor member, and 
perhaps a lock method. Such a mixin should be defined by the standard lib, 
so that the user would not be required to write one. Then we would have 
code like this:

   class Foo {
     mixin Lockable; // std.thread.Lockable ?

     void fn() {
       synchronized {
         ... // stuff
       }
     }
   }

which isn't that much more verbose, and now regular Objects would not 
have the monitor member.
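
Java has no mixins, but an interface with a default method gives a similar opt-in shape. This is a sketch by analogy only: `OptInLockable`, `synchronizedOn` and `LockableFoo` are hypothetical names, and because Java interface methods are public, the monitor is visible to callers here, which a D mixin could avoid.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

interface OptInLockable {
    // Implementing classes supply their own monitor; other classes carry none.
    ReentrantLock monitor();

    // Stands in for an implicit "synchronized { ... }" block.
    default <R> R synchronizedOn(Supplier<R> body) {
        ReentrantLock m = monitor();
        m.lock();
        try {
            return body.get();
        } finally {
            m.unlock();
        }
    }
}

final class LockableFoo implements OptInLockable {
    private final ReentrantLock monitor = new ReentrantLock();
    private int hits;

    @Override
    public ReentrantLock monitor() {
        return monitor;
    }

    int fn() {
        return synchronizedOn(() -> ++hits); // body runs under the monitor
    }

    public static void main(String[] args) {
        LockableFoo foo = new LockableFoo();
        foo.fn();
        System.out.println(foo.fn());
    }
}
```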

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D

Jul 30 2006
