digitalmars.D - Common Issue in Shared Code

About a month or so ago, I started trying to convert a codebase I've been
working on into a multithreaded system, and I've been hitting this sort of
thing over and over:
--------
// used as a field and as a local variable all over the codebase
struct Data {
    int a, b, c;

    int total() {
        return a + b + c;
    }
}

// has a Data as one of its members but never escapes a pointer to it
class Bob {
private:
    Data _dat;

public:
    int currentTotal() {
        return _dat.total();
    }
}
--------
Now, as part of my multithreaded refactor, I need to make Bob synchronized,
but that means the Data field inside it is shared, which means I can no
longer call the total() method in currentTotal().
To fix this, I could make Data synchronized as well, but Data is used all
over the codebase, most of the time as a local variable inside a function.
In my particular case, I see this a lot with a struct that represents a
location, which is just 2 bytes in my codebase, so adding a monitor would
more than double the size, and the locking overhead would be completely
unnecessary.
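
To put rough numbers on the size concern, here is a minimal sketch. The `Location` and `LocationObject` types and their field names are made up for illustration, and it assumes that giving the type a monitor means turning it into a class, whose instances start with a vtable pointer and a monitor slot.

```d
// Hypothetical 2-byte location type; the field names are assumptions.
struct Location
{
    ubyte x, y;
}

// The same data as a class, so that each instance carries a monitor.
class LocationObject
{
    ubyte x, y;
}

// The plain struct is exactly two bytes.
static assert(Location.sizeof == 2);

// Every class instance begins with a vtable pointer and a monitor slot.
enum headerSize = 2 * (void*).sizeof;
static assert(__traits(classInstanceSize, LocationObject)
        >= headerSize + Location.sizeof);
```
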
If I don't want to make it synchronized, I could just cast away shared
everywhere I use it as a field, which looks ugly and is confusing when I
look at the codebase.
If I don't want to cast away shared, I could just make Data shared and
assume that the owner will make sure it's not shared improperly, but at
this point I've disabled all help the type system could provide me.
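
For illustration, the cast-away-shared workaround might look roughly like the sketch below. The method is marked both `synchronized` and `shared` to make the typing explicit, and the default field values and the `unittest` block are assumptions added only so the example can be run (for instance with `dmd -unittest -main`).

```d
// used as a field and as a local variable all over the codebase
struct Data
{
    int a, b, c;

    int total()
    {
        return a + b + c;
    }
}

class Bob
{
    // Default values are illustrative only.
    private Data _dat = Data(1, 2, 3);

    // Inside a shared method, this._dat is typed shared(Data).
    synchronized int currentTotal() shared
    {
        static assert(is(typeof(this._dat) == shared(Data)));

        // The direct call is rejected: total() is not a shared method.
        static assert(!__traits(compiles, this._dat.total()));

        // Workaround: cast shared away at the point of use. This is only
        // sound because the monitor is held and no pointer to _dat ever
        // leaves Bob; the compiler checks neither condition.
        return (cast(Data*) &_dat).total();
    }
}

unittest
{
    auto bob = new shared(Bob);
    assert(bob.currentTotal() == 6);
}
```

Every method that touches `_dat` needs the same cast, which is the noise complained about above.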

For reference, this is what TDPL (The D Programming Language) says:
--------
For synchronized methods:
"Maybe not very intuitively, the temporary nature of synchronized entails
the rule that no address of a field can escape a synchronized address. If
that happened, some other portion of the code could access some data beyond
the temporary protection conferred by method-level synchronization."

For synchronized classes:
• All numeric types are not shared (they have no tail) so they can be
manipulated normally.
• Array fields declared with type T[] receive type shared(T)[]; that
is, the head (the slice limits) is not shared and the tail (the contents of
the array) remains shared.
• Pointer fields declared with type T* receive type shared(T)*; that is,
the head (the pointer itself) is not shared and the tail (the pointed-to
data) remains shared.
• Class fields declared with type T receive type shared(T). Classes are
automatically by-reference, so they're "all tail."
These rules apply on top of the no-escape rule described in the previous
section.
One direct consequence is that operations affecting direct fields of the
object can be freely reordered and optimized inside the method, as if
sharing has been temporarily suspended for them - which is exactly what
synchronized does.
--------

At first glance, it seems like the first rule should also apply to structs
(that is, it should cover value types in general), but it can't, because a
struct could contain a reference to another object, and that reference
must remain transitively shared. Typing a struct as shared if it contains a
reference and unshared otherwise would just be confusing, but the fact
remains that the language does not currently address this use case in a
satisfying way.

When I flag a type as shared, all instances of it are forced to become
shared, but the compiler assumes that the programmer has properly
synchronized things such that sharing instances of the type is safe. Why,
then, can I not force the compiler to assume I've properly synchronized
things for a field of a class? In this case, the effect would be the
opposite - the field wouldn't be flagged as shared, but supposing we had
such a keyword, it would act as a much more limited version of the "shared"
keyword because I'm only forcing the compiler to assume I've done things
properly within the context of a class.
The keyword would have to be restricted such that it could only be applied
to private fields, and the compiler would continue to enforce (as much as
is reasonable) that the address of the field does not escape.
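
No such keyword exists in D today. As a rough library-level approximation of the idea, the sketch below uses a hypothetical wrapper type named `Owned` (not an existing library type) that hides the cast in one place. The names `Owned`, `unshared`, and `set` are assumptions made for illustration, and unlike the proposed keyword, nothing here stops the returned reference from escaping.

```d
struct Data
{
    int a, b, c;

    int total()
    {
        return a + b + c;
    }
}

// Hypothetical wrapper: marks a field as protected by its owner's lock.
struct Owned(T)
{
    private T _value;

    // Hands back an unshared view of the payload. Only meaningful while
    // the owner's monitor is held; the compiler does not check that.
    ref T unshared() shared
    {
        return *cast(T*) &_value;
    }
}

class Bob
{
    private Owned!Data _dat;

    synchronized int currentTotal() shared
    {
        return _dat.unshared().total();
    }

    synchronized void set(int a, int b, int c) shared
    {
        _dat.unshared() = Data(a, b, c);
    }
}

unittest
{
    auto bob = new shared(Bob);
    assert(bob.currentTotal() == 0);
    bob.set(1, 2, 3);
    assert(bob.currentTotal() == 6);
}
```

A keyword would do better than this wrapper in two ways: it could be limited to private fields, and the compiler could reject code that lets the field's address escape.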

I believe that this case of data sharing will appear and frustrate
programmers in almost any multithreaded program, and that finding a
satisfying solution to allow the language to provide as many guarantees as
possible is worthwhile.

Any thoughts?

Nov 20 2011