digitalmars.D - pinned classes

reply bearophile <bearophileHUGS lycos.com> writes:
Thinking more about some of the things I've recently written to Mike S, I think
the situation of the D GC can be improved not by teaching the D type system how
to tell apart three types of pointers, but by introducing a @pinned attribute
for classes:

@pinned class Foo {}

Unpinned memory can be moved, which allows an efficient generational moving
GC.

All objects instantiated from an unpinned class are unpinned. This is a little
limiting (you could think of allowing both pinned and unpinned instances), but
it keeps the situation simpler for the compiler and the programmer.

With unpinned classes, Java/C# programmers can program in D in a style similar
to the one they are used to in their languages. This is good.

Classes are unpinned by default (the opposite of the current situation) to
maximize the number of unpinned objects.

The @pinned attribute can't be used with structs and enums. They are always
pinned because Java programmers don't use them, because they are usually used
for performance in a lower-level way, and because they don't have a virtual
table pointer that the GC can use, etc.

Normal unpinned classes can't contain pointers to their fields or to unpinned
memory, in a transitive way. They can contain pointers to pinned memory.

In system (unsafe) modules you can of course cast an unpinned class reference to
a pointer, but this is nearly useless, because the GC can move the object in
memory at any moment. It can be useful for a short time if you disable the GC.

Pinned classes act as they do now; they can contain pointers to their own
fields too.

The GC can move unpinned objects around and must keep the pinned ones in place.
When it moves an object, the GC has to update every reference to it (references
on the stack, inside other objects, etc.) to point to the new position.

Probably enums can't contain references to unpinned memory, to keep things tidy.

This can be a compile error; probably Bar has to be unpinned too:
class Foo {}
@pinned class Bar: Foo {}
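
As a rough illustration only (none of this is part of the proposal), the two rules above can be approximated in today's D with a user-defined attribute and compile-time checks. The marker `pinned` and the helpers `isPinned`, `hasPointerField`, `fieldsOk` and `inheritanceOk` are hypothetical names. The field check is deliberately stricter than the proposal: it rejects every raw pointer in an unpinned class, because the type system cannot tell pointers to pinned memory from pointers to unpinned memory, and it only looks at directly declared fields.

```d
import std.meta : anySatisfy;
import std.traits : BaseClassesTuple, Fields, hasUDA, isPointer;

/// Hypothetical marker standing in for the proposed attribute.
enum pinned;

/// True when class C is annotated with @pinned.
enum isPinned(C) = hasUDA!(C, pinned);

/// True when C declares at least one raw pointer field (direct fields only).
enum hasPointerField(C) = anySatisfy!(isPointer, Fields!C);

/// Simplified rule: an unpinned class holds no raw pointers at all.
enum fieldsOk(C) = isPinned!C || !hasPointerField!C;

/// Rule from the post: a pinned class may not derive from an unpinned one.
template inheritanceOk(C)
{
    alias Bases = BaseClassesTuple!C; // direct base first, Object last
    static if (isPinned!C && Bases.length > 1)
        enum inheritanceOk = isPinned!(Bases[0]);
    else
        enum inheritanceOk = true;
}

class Foo { int x; }                      // unpinned by default
@pinned class SelfRef { int x; int* px; } // pinned, may point at its own field
class Leaky { int x; int* px; }           // unpinned, yet holds a raw pointer
@pinned class Bar : Foo {}                // pinned child of an unpinned parent

static assert( fieldsOk!Foo && inheritanceOk!Foo);
static assert( fieldsOk!SelfRef && inheritanceOk!SelfRef);
static assert(!fieldsOk!Leaky);
static assert(!inheritanceOk!Bar);

void main() {} // every check above runs at compile time
```

A real implementation would live in the compiler and would have to apply the field rule transitively; the sketch only shows that both rules are decidable from declarations alone.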

I'm sure I'm forgetting several things :-)

Bye,
bearophile
Mar 31 2010
next sibling parent reply Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Wed, 31 Mar 2010 22:59:08 -0400, bearophile 
<bearophileHUGS lycos.com> wrote:
 
 Thinking more about some of the things I've recently written to Mike S, I
 think the situation of the D GC can be improved not by teaching the D type
 system how to tell apart three types of pointers, but by introducing a @pinned
 attribute for classes [...]

I think the D2 spec puts restrictions on what you can do with GC-allocated pointers (can't convert them to integers, can't perform arithmetic on them outside of their bounds, etc.), and I think they're restrictive enough that a copying garbage collector could work with no changes to compliant code.

- Justin Spahr-Summers
Mar 31 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Justin Spahr-Summers:
 I think the D2 spec puts restrictions on what you can do with GC-
 allocated pointers (can't convert them to integers, can't perform 
 arithmetic on them outside of their bounds, etc.), and I think they're 
 restrictive enough that a copying garbage collector could work with no 
 changes to compliant code.

Without annotations, the compiler has to find out by itself that instances of
this class are harder to move:

import core.stdc.stdio: printf;

class Foo {
    int x;
    int* ptrx; // interior pointer: points at this object's own field
    this(int xx) {
        this.x = xx;
        this.ptrx = &(this.x);
    }
}

void main() {
    auto f = new Foo(10);
    printf("%d %d\n", f.x, *f.ptrx);
    auto p = f.ptrx; // a second pointer into the same object, held on the stack
}

Bye,
bearophile
Apr 01 2010
parent reply Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Thu, 01 Apr 2010 05:27:55 -0400, bearophile 
<bearophileHUGS lycos.com> wrote:
 
 Without annotations, the compiler has to find out by itself that instances of
 this class are harder to move: [...]

But shouldn't the GC know the size of Foo instances? It seems like it should be able to rewrite any GC-managed pointers that point to 'f' or anywhere inside it.
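
A minimal sketch of the fix-up described here, assuming the collector knows an object's start address and size. `Node` is a stand-in for the Foo class from the previous post (a struct keeps the copy simple), and `rebase` is a hypothetical helper, not a druntime function. The "move" is simulated with an ordinary copy.

```d
import std.stdio : writeln;

// Stand-in for the thread's Foo; a struct keeps the byte-for-byte copy simple.
struct Node
{
    int x;
    int* ptrx; // interior pointer into this same object
}

// Shifts p into the new copy if it points into [oldBase, oldBase + size).
void* rebase(void* p, void* oldBase, void* newBase, size_t size)
{
    auto q = cast(ubyte*) p;
    auto lo = cast(ubyte*) oldBase;
    if (q >= lo && q < lo + size)
        return cast(ubyte*) newBase + (q - lo);
    return p; // points elsewhere: leave untouched
}

void main()
{
    auto old = new Node;
    old.x = 10;
    old.ptrx = &old.x;

    // Naive move: a plain copy keeps ptrx aimed at the old location.
    auto moved = new Node;
    *moved = *old;
    assert(moved.ptrx is &old.x);

    // Move with fix-up: the object's start and size are known.
    moved.ptrx = cast(int*) rebase(moved.ptrx, old, moved, Node.sizeof);
    assert(moved.ptrx is &moved.x);

    // A pointer to unrelated memory is left alone.
    int other = 5;
    assert(rebase(&other, old, moved, Node.sizeof) is &other);

    writeln(*moved.ptrx); // value reached through the rebased pointer
}
```

The hard part is not this arithmetic but finding every such pointer, including the ones sitting on the stack.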
Apr 01 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Justin Spahr-Summers:

 But shouldn't the GC know the size of Foo instances?

Yes, if Foo is a class instance.
 It seems like it should be able to rewrite any GC-managed pointers that point
 to 'f' or anywhere inside it.

Maybe it can be done, but in a systems language it can become a mess; better not to go there. That's why in the original post I suggested allowing only classes to be unpinned, so only a subset of references point to unpinned memory. All pointers, and part of the class references, point to pinned memory. This keeps things much simpler, and I think it's still useful enough. Java programmers don't use pointers, and pointers are meant for more manual management.

Bye,
bearophile
Apr 01 2010
prev sibling parent reply div0 <div0 users.sourceforge.net> writes:
Justin Spahr-Summers wrote:
 I think the D2 spec puts restrictions on what you can do with GC-allocated
 pointers (can't convert them to integers, can't perform arithmetic on them
 outside of their bounds, etc.), and I think they're restrictive enough that a
 copying garbage collector could work with no changes to compliant code.

The trouble with a moving garbage collector is that you have to be able to accurately scan the stacks of all threads. This is difficult to do with a language that can arbitrarily call into functions provided by different languages for which the compiler won't and can't have stack layout info.

It's especially a pain on Windows, as D code is called from window procedures, so most code is already using a stack which can't be scanned. Might be more doable on Linux; Xlib has a vastly better design than Win32.

--
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
Apr 01 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
div0:
 The trouble with a moving garbage collector is that you have to be able
 to accurately scan the stacks of all threads. This is difficult to do
 with a language that can arbitrarily call into functions provided by
 different languages for which the compiler won't and can't have stack
 layout info.

All pointers are to pinned memory. You can't give an "unpinned reference" to, or receive one from, an external function written in another language. Only the D GC manages references to unpinned classes. So, can you explain to me what problems you see?

Bye,
bearophile
Apr 01 2010
parent reply div0 <div0 users.sourceforge.net> writes:
bearophile wrote:
 All pointers are to pinned memory. You can't give or receive an "unpinned
 reference" to an external function written in other languages. Only the D GC
 manages references to unpinned classes. So, can you explain me what problems
 you see?

Well, on Windows, in a single-threaded application your stack usually looks like this:

==== Stack start
D runtime start up
- - - - - -
void main(...) - your application stack
- - - - - -
DispatchMessage - win32 call
* * * *
windowProc - your D application window callback proc
- - - - - - - -
SendMessage - win32 call
* * * *
windowProc - another D application window callback proc
- - - - - - - -

Where - is D stack and * is unknown stack (usually win32 functions).

Any part of the stack used by D code might, and probably will, have pointers to unpinned objects. In order to move an object you must be able to update all pointers to it, otherwise your application is obviously going to crash.

There is no reasonable way to know anything about the parts of the stack indicated by *, including really where they start or end once you start talking about nested callbacks.

You can't easily or reasonably find the bits of the stack used by D code, so you can't scan the stack, and therefore you can't move any object.

Throw threading into the mix and your stacks get even more complicated.

This isn't strictly a Windows problem though. There are quite a few C libraries which use a callback mechanism and they'll all suffer from the same problem. This is part of the reason why you have to jump through hoops to call native code from Java or C#.
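
The same sandwich of frames can be reproduced without Win32. The sketch below, which is only an illustration and not taken from the thread, uses the C library's `qsort` as the foreign code: while the comparator runs, the stack holds a D frame, then C frames whose layout the D compiler knows nothing about, then another D frame. The comparator name `byValue` is made up for the example.

```d
import core.stdc.stdlib : qsort;
import std.stdio : writeln;

// D callback invoked from inside the C runtime's qsort frames.
extern (C) int byValue(const void* a, const void* b) nothrow @nogc
{
    const x = *cast(const(int)*) a;
    const y = *cast(const(int)*) b;
    return (x > y) - (x < y); // negative, zero or positive, as qsort expects
}

void main()
{
    int[] data = [3, 1, 2];

    // Stack while byValue runs:
    //   main     (D, layout known to the D compiler)
    //   qsort    (C, unknown layout)
    //   byValue  (D again)
    qsort(data.ptr, data.length, int.sizeof, &byValue);

    assert(data == [1, 2, 3]);
    writeln(data);
}
```

Here nothing in the comparator refers to movable objects, but a callback that did would hide those references behind the C frames.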
Apr 01 2010
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
div0:
 You can't easily or reasonably find the bits of the stack used by D
 code, so you can't scan the stack, and therefore you can't move any object.
 
 Throw threading into the mix and your stacks get even more complicated.

Thank you for your explanation, I have to learn more. In two years I hope to start giving useful ideas :-)

Bye,
bearophile
Apr 01 2010
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
On 4/1/2010 15:22, div0 wrote:
 You can't easily or reasonably find the bits of the stack used by D
 code, so you can't scan the stack, and therefore you can't move any object.

You could, if D was modified to register the parts of the stack that it uses in some sort of per-thread stack registry. It's not cheap, but it works, and it can be optimized until it's cheap enough.

--
Rainer Deyke - rainerd eldwood.com
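
One way to picture such a registry, as a sketch under assumptions rather than a design from the thread: every D entry point records an address inside its own frame in a thread-local list and removes it on exit. `dStackMarkers`, `windowProc` and `foreignDispatch` are hypothetical names, and `foreignDispatch` is written in D here only to stand in for a foreign dispatcher such as DispatchMessage.

```d
import std.stdio : writeln;

// Module-level variables are thread-local by default in D,
// so each thread gets its own registry.
void*[] dStackMarkers;

// Stand-in for a foreign dispatcher that calls back into D.
extern (C) int foreignDispatch(int depth)
{
    return windowProc(depth);
}

// A D callback that announces its frame on entry and withdraws it on exit.
int windowProc(int depth)
{
    int marker;               // its address lies inside this D frame
    dStackMarkers ~= &marker;
    scope (exit) dStackMarkers = dStackMarkers[0 .. $ - 1];

    if (depth > 0)
        return foreignDispatch(depth - 1); // nested callback through foreign code
    return cast(int) dStackMarkers.length; // D regions registered right now
}

void main()
{
    assert(windowProc(0) == 1);
    assert(windowProc(2) == 3);
    assert(dStackMarkers.length == 0); // everything withdrawn again
    writeln("registered regions at depth 2: ", windowProc(2));
}
```

A collector could walk `dStackMarkers` to learn where D-owned stretches of the stack begin; knowing where each one ends, and what each slot holds, still needs compiler support.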
Apr 02 2010
parent div0 <div0 users.sourceforge.net> writes:
Rainer Deyke wrote:
 You could, if D was modified to register the parts of the stack that it uses
 in some sort of per-thread stack registry. It's not cheap, but it works, and
 it can be optimized until it's cheap enough.

Yeah, I was thinking of adding some sort of callback attribute/property. A callback-decorated function would have a preamble to register the stack address with the runtime. Though you would need to modify dmd to generate accurate stack frame maps as well, and make that info available to the runtime. That's no small amount of work.

I'd want two types of callback though: one which does the registration, and another which says that the callback function won't use unpinned objects, so the registration is not necessary.
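
The two flavours could look roughly like this in library form. This is a sketch with made-up names (`Track`, `callback`, `activeRegistrations`), and a counter stands in for the runtime's registry; the attribute and the stack frame maps mentioned above would need compiler work that a wrapper cannot provide.

```d
import std.stdio : writeln;

// Hypothetical names for the two callback flavours.
enum Track { registered, untracked }

// Thread-local counter standing in for the runtime's stack registry.
size_t activeRegistrations;

// Wraps fn; only the `registered` flavour pays for the bookkeeping.
auto callback(Track mode, alias fn, Args...)(Args args)
{
    static if (mode == Track.registered)
    {
        ++activeRegistrations;              // preamble
        scope (exit) --activeRegistrations; // epilogue, runs even if fn throws
    }
    return fn(args);
}

// Sample body: reports how many registrations are live while it runs.
size_t probe() { return activeRegistrations; }

void main()
{
    assert(callback!(Track.registered, probe)() == 1);
    assert(callback!(Track.untracked, probe)() == 0);
    assert(activeRegistrations == 0);
    writeln("ok");
}
```

Because the choice is a template argument, the untracked flavour compiles down to a plain call with no registration code at all.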
Apr 02 2010
prev sibling parent Justin Spahr-Summers <Justin.SpahrSummers gmail.com> writes:
On Thu, 01 Apr 2010 18:58:47 +0100, div0 <div0 users.sourceforge.net> 
wrote:
 
 The trouble with a moving garbage collector is that you have to be able to
 accurately scan the stacks of all threads. This is difficult to do with a
 language that can arbitrarily call into functions provided by different
 languages for which the compiler won't and can't have stack layout info. [...]

My knowledge of garbage collection techniques isn't as extensive as I'd like it to be. My previous reply was mostly an assertion that the D spec is strict enough about GC-managed pointers that more advanced/precise garbage collectors SHOULD theoretically be possible, although there may be platform-imposed obstacles like what you mentioned.

For instance, I think the standard is specific enough that maybe GC-managed "pointers" could be implemented as indices into a managed-memory lookup table, using an actual pointer when calling extern (C) functions or some such. I really doubt this is an optimal implementation, but my understanding is that it should be possible without any code changes to compliant D programs.
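
A toy version of the lookup-table idea, assuming a table of `int*` slots for brevity. `Handle` and `HandleTable` are hypothetical names, and `relocate` imitates a moving collector by copying the value to a fresh allocation. Only the table slot has to change when an object moves; every handle held elsewhere stays valid.

```d
import std.stdio : writeln;

/// An index into the table instead of a raw address.
struct Handle { size_t index; }

/// Minimal table mapping handles to current addresses.
struct HandleTable
{
    int*[] slots;

    Handle add(int* p)
    {
        slots ~= p;
        return Handle(slots.length - 1);
    }

    /// The real address, e.g. for passing to an extern (C) function.
    int* resolve(Handle h) { return slots[h.index]; }

    /// "Moves" the object: copies it and updates the single slot.
    void relocate(Handle h)
    {
        auto fresh = new int;
        *fresh = *slots[h.index];
        slots[h.index] = fresh;
    }
}

void main()
{
    HandleTable table;
    auto p = new int;
    *p = 42;
    auto h = table.add(p);

    table.relocate(h);

    assert(table.resolve(h) !is p);  // the object lives somewhere else now
    assert(*table.resolve(h) == 42); // the same handle still reaches it
    writeln(*table.resolve(h));
}
```

The cost is an extra indirection on every access, and any raw pointer obtained through `resolve` is only good until the next relocation.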
Apr 01 2010
prev sibling parent Justin Johansson <no spam.com> writes:
Sounds like a revisitation of Microsoft extensions to C++ a number of years
(maybe a decade) ago. I distinctly recall a short-lived new version of Visual C++
christening a declspec like "__pinned" for C++ pointers and/or class declarations
in respect of their up-and-coming garbage-collected platform. It was not called
.Net then, but from memory this particular version of Visual C++ was the precursor
to C# / .Net, and then the next version of VC++ promptly forgot about "__pinned".

A similar MS C++ declspec, "__based", happened a decade or two before "__pinned",
in relation to a pointer being just an offset into a 64KB block in the old
80286 segment/offset memory model.

If history were to repeat itself, perhaps D will evolve to D#.

Regards,
Justin Johansson



Apr 01 2010