
digitalmars.D - GC API: What can change for precise scanning?

reply "dsimcha" <dsimcha yahoo.com> writes:
Now that the compiler infrastructure has been implemented, I've 
gotten busy figuring out how to make D's default GC precise.  As 
a first attempt, I think I'm going to adapt my original solution 
from http://d.puremagic.com/issues/show_bug.cgi?id=3463 since 
it's simple and it works except that there previously was no 
clean way to get the offset info into the GC.  As Walter pointed 
out in another thread, the GCInfo template is allowed to 
instantiate to data instead of a function.  IMHO unless/until 
major architectural changes to the GC are made that require a 
function pointer, there's no point in adding this indirection.
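
To make "GCInfo instantiating to data" concrete, here is a minimal sketch of one possible encoding: a compile-time function that produces a pointer bitmap for a type. The name `pointerBitmap`, the one-bit-per-word encoding and the `Node` struct are assumptions made for illustration; they are not taken from the patch in issue 3463 or from druntime.

```d
// Hypothetical helper, not part of druntime: returns a bitmap with one bit per
// machine word of T, where a set bit means "this word may hold a GC pointer".
// `base` is the byte offset of the value inside the enclosing object.
// Assumptions: T spans at most size_t.sizeof * 8 words, and only structs,
// pointers, class references and dynamic arrays are handled (no static
// arrays, delegates, unions or associative arrays).
size_t pointerBitmap(T)(size_t base = 0)
{
    size_t bits = 0;
    static if (is(T == struct))
    {
        // Recurse into each field at its own offset.
        foreach (i, F; typeof(T.tupleof))
            bits |= pointerBitmap!F(base + T.tupleof[i].offsetof);
    }
    else static if (is(T == class))
    {
        bits |= size_t(1) << (base / size_t.sizeof);
    }
    else static if (is(T == U*, U))
    {
        bits |= size_t(1) << (base / size_t.sizeof);
    }
    else static if (is(T == U[], U))
    {
        // A slice is laid out as (length, ptr); only the second word is a pointer.
        bits |= size_t(1) << (base / size_t.sizeof + 1);
    }
    return bits;
}

struct Node
{
    int value;
    Node* next;
    int[] data;
}

// Evaluated at compile time, so the result is plain data that could be
// stored alongside the TypeInfo instead of a function pointer.
enum size_t nodeBitmap = pointerBitmap!Node();
static assert(nodeBitmap == 0b1010);
```

Because the result is an ordinary constant, the GC would only need to receive one word per type; a real implementation would still need a scheme for objects larger than the bitmap.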

I started working on this and I ran into a roadblock.  I need to 
know what parts of the GC API are allowed to change, and discuss 
how to abstract away the implementation of it from the GC API.  I 
assume the stuff in core.memory needs to stay mostly the same, 
though I guess we would need to add a setType() function that 
takes a pointer into a block of memory and a TypeInfo object and 
changes how the GC interprets the bits in the block.

In gc.d, we define a bunch of extern(C) functions and the proxy 
thing.  Since we've given up on the idea of swapping precise GCs 
at link time, can I just rip out all this unnecessary indirection? 
If not, is it OK to change some of these signatures?  I 
definitely want to avoid allocating (requiring the GC lock) and 
then calling a function to set the type (requiring another lock 
acquisition) so the signature of malloc(), etc. needs to change 
somewhere.
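
The locking concern can be sketched with a toy heap. Everything here is hypothetical (`ToyHeap`, `mallocTyped`, the `lockAcquisitions` counter); only the idea of a `setType()` call comes from the post above. The sketch shows that allocate-then-`setType` takes the lock twice, while an allocation call that accepts the layout information takes it once.

```d
import core.stdc.stdlib : cMalloc = malloc;
import core.sync.mutex : Mutex;

// Hypothetical toy heap; none of these names exist in druntime.
// Blocks are never freed, which is fine for a sketch but not for real code.
final class ToyHeap
{
    private Mutex mtx;
    private size_t[void*] bitmapOf; // per-block pointer bitmap (0 = no layout known)
    size_t lockAcquisitions;        // instrumentation for the example only

    this() { mtx = new Mutex; }

    // Untyped allocation: the block starts out with no layout information.
    void* malloc(size_t size) { return mallocTyped(size, 0); }

    // Separate call that attaches layout information after the fact.
    void setType(void* p, size_t bitmap)
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        ++lockAcquisitions;
        bitmapOf[p] = bitmap;
    }

    // Combined call: allocate and record the layout under a single lock.
    void* mallocTyped(size_t size, size_t bitmap)
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        ++lockAcquisitions;
        void* p = cMalloc(size);
        if (p !is null)
            bitmapOf[p] = bitmap;
        return p;
    }
}
```

With this shape, `malloc` followed by `setType` bumps the counter twice, whereas `mallocTyped` bumps it once, which is why the type information has to travel through the allocation signature itself.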

More generally, what is the intended way to get GCInfo pointers 
from TypeInfo into the guts of the GC where they can be acted on?
Apr 17 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
Le 18/04/2012 02:36, dsimcha a écrit :
 Now that the compiler infrastructure has been implemented, I've gotten
 busy figuring out how to make D's default GC precise. As a first
 attempt, I think I'm going to adapt my original solution from
 http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and
 it works except that there previously was no clean way to get the offset
 info into the GC. As Walter pointed out in another thread, the GCInfo
 template is allowed to instantiate to data instead of a function. IMHO
 unless/until major architectural changes to the GC are made that require
 a function pointer, there's no point in adding this indirection.

 I started working on this and I ran into a roadblock. I need to know
 what parts of the GC API are allowed to change, and discuss how to
 abstract away the implementation of it from the GC API. I assume the
 stuff in core.memory needs to stay mostly the same, though I guess we
 would need to add a setType() function that takes a pointer into a block
 of memory and a TypeInfo object and changes how the GC interprets the
 bits in the block.

 In gc.d, we define a bunch of extern(C) functions and the proxy thing.
 Since we've given up on the idea of swapping precise GCs at link time,
 can I just rip out all this unnecessary indirection? If not, is it ok to
 change some of these signatures? I definitely want to avoid allocating
 (requiring the GC lock) and then calling a function to set the type
 (requiring another lock acquisition) so the signature of malloc(), etc.
 needs to change somewhere.

 More generally, what is the intended way to get GCInfo pointers from
 TypeInfo into the guts of the GC where they can be acted on?
I guess that the flag to indicate whether some piece of memory may contain pointers can go away. I think you can certainly remove all the indirection. Additionally, I wonder why most of these functions are extern(C).
Apr 18 2012
next sibling parent Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 18-04-2012 11:56, deadalnix wrote:
 Le 18/04/2012 02:36, dsimcha a écrit :
 Now that the compiler infrastructure has been implemented, I've gotten
 busy figuring out how to make D's default GC precise. As a first
 attempt, I think I'm going to adapt my original solution from
 http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and
 it works except that there previously was no clean way to get the offset
 info into the GC. As Walter pointed out in another thread, the GCInfo
 template is allowed to instantiate to data instead of a function. IMHO
 unless/until major architectural changes to the GC are made that require
 a function pointer, there's no point in adding this indirection.

 I started working on this and I ran into a roadblock. I need to know
 what parts of the GC API are allowed to change, and discuss how to
 abstract away the implementation of it from the GC API. I assume the
 stuff in core.memory needs to stay mostly the same, though I guess we
 would need to add a setType() function that takes a pointer into a block
 of memory and a TypeInfo object and changes how the GC interprets the
 bits in the block.

 In gc.d, we define a bunch of extern(C) functions and the proxy thing.
 Since we've given up on the idea of swapping precise GCs at link time,
 can I just rip out all this unnecessary indirection? If not, is it ok to
 change some of these signatures? I definitely want to avoid allocating
 (requiring the GC lock) and then calling a function to set the type
 (requiring another lock acquisition) so the signature of malloc(), etc.
 needs to change somewhere.

 More generally, what is the intended way to get GCInfo pointers from
 TypeInfo into the guts of the GC where they can be acted on?
I guess that the flag to indicate if some piece of memory may have pointer can go away.
+1. This is useless if we're going to use bitmaps or similar.
 I think you certainly can remove all indirection. Additionally, I wonder
 why most of these functions are extern(C).
-- - Alex
Apr 18 2012
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 4/18/2012 2:56 AM, deadalnix wrote:
 I think you certainly can remove all indirection. Additionally, I wonder why
 most of these functions are extern(C).
The purpose of the indirection is so that DLLs in Windows can share a gc instance, rather than having two instances fight each other.
Apr 18 2012
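
The sharing argument can be illustrated with a small sketch. The names below (`GCProxy`, `setProxy`, `gcAlloc`, and the two counters) are hypothetical and much simpler than whatever gc.d actually does; the point is only that every allocation goes through one replaceable table of function pointers, so a DLL can be handed the host's table instead of running a second collector.

```d
import core.stdc.stdlib : malloc;

// Hypothetical proxy table; a real one would carry the whole GC interface,
// and the thread suggests those entries would be extern(C) functions.
struct GCProxy
{
    void* function(size_t size) alloc;
}

__gshared GCProxy* proxy;      // null: this module uses its own collector
__gshared size_t localAllocs;  // instrumentation for the example only
__gshared size_t hostAllocs;

// Stand-ins for "the DLL's own GC" and "the host executable's GC".
void* localAlloc(size_t size) { ++localAllocs; return malloc(size); }
void* hostAlloc(size_t size)  { ++hostAllocs;  return malloc(size); }

// Called by the host after loading the DLL, so that both share one instance.
void setProxy(GCProxy* p) { proxy = p; }

// Every allocation in the module goes through this single indirection.
void* gcAlloc(size_t size)
{
    return proxy is null ? localAlloc(size) : proxy.alloc(size);
}
```

Before `setProxy` is called, `gcAlloc` uses the local allocator; afterwards every request is forwarded to the host, at the cost of one indirect call per operation.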
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Apr 18, 2012, at 2:56 AM, deadalnix wrote:

 I think you certainly can remove all indirection. Additionally, I wonder why most of these functions are extern(C).
So the GC implementation is opaque and the GC can therefore be chosen at link-time. Similar to how the compiler runtime code hides behind a raft of extern(C) functions.
Apr 18 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 18/04/2012 20:53, Sean Kelly a écrit :
 On Apr 18, 2012, at 2:56 AM, deadalnix wrote:
 I think you certainly can remove all indirection. Additionally, I wonder why
most of these functions are extern(C).
So the GC implementation is opaque and the GC can therefore be chosen at link-time. Similar to how the compiler runtime code hides behind a raft of extern(C) functions.
I know, but this is now impossible because of the modification of TypeInfo anyway.
Apr 18 2012
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Apr 18, 2012, at 4:02 PM, deadalnix wrote:

 Le 18/04/2012 20:53, Sean Kelly a écrit :
 On Apr 18, 2012, at 2:56 AM, deadalnix wrote:

 I think you certainly can remove all indirection. Additionally, I wonder why most of these functions are extern(C).

 So the GC implementation is opaque and the GC can therefore be chosen at link-time. Similar to how the compiler runtime code hides behind a raft of extern(C) functions.

 I know, but this is now impossible because of the modification of TypeInfo anyway.
I'm not sure I follow. Are you saying the change breaks having the GC behind extern C functions? How?
Apr 18 2012
parent reply deadalnix <deadalnix gmail.com> writes:
Le 19/04/2012 02:08, Sean Kelly a écrit :
 On Apr 18, 2012, at 4:02 PM, deadalnix wrote:

 Le 18/04/2012 20:53, Sean Kelly a écrit :
 On Apr 18, 2012, at 2:56 AM, deadalnix wrote:
 I think you certainly can remove all indirection. Additionally, I wonder why
most of these functions are extern(C).
So the GC implementation is opaque and the GC can therefore be chosen at link-time. Similar to how the compiler runtime code hides behind a raft of extern(C) functions.
I know, but this is now impossible because of the modification of TypeInfo anyway.
I'm not sure I follow. Are you saying the change breaks having the GC behind extern C functions? How?
No, it doesn't break having the GC behind C functions. It breaks the possibility of changing the GC at link time, because different GCs need different data generated in TypeInfo. So the indirection becomes useless.
Apr 18 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 04/19/2012 02:21 AM, deadalnix wrote:
 Le 19/04/2012 02:08, Sean Kelly a écrit :
 On Apr 18, 2012, at 4:02 PM, deadalnix wrote:

 Le 18/04/2012 20:53, Sean Kelly a écrit :
 On Apr 18, 2012, at 2:56 AM, deadalnix wrote:
 I think you certainly can remove all indirection. Additionally, I
 wonder why most of these functions are extern(C).
So the GC implementation is opaque and the GC can therefore be chosen at link-time. Similar to how the compiler runtime code hides behind a raft of extern(C) functions.
I know, but this is now impossible because of the modification of TypeInfo anyway.
I'm not sure I follow. Are you saying the change breaks having the GC behind extern C functions? How?
No, it doesn't break having the GC behind C functions. It breaks the possibility of changing the GC at link time, because different GCs need different data generated in TypeInfo. So the indirection becomes useless.
The indirection is there for better shared library support.
Apr 18 2012
prev sibling next sibling parent Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 18-04-2012 02:36, dsimcha wrote:
 Now that the compiler infrastructure has been implemented, I've gotten
 busy figuring out how to make D's default GC precise. As a first
 attempt, I think I'm going to adapt my original solution from
 http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and
 it works except that there previously was no clean way to get the offset
 info into the GC. As Walter pointed out in another thread, the GCInfo
 template is allowed to instantiate to data instead of a function. IMHO
 unless/until major architectural changes to the GC are made that require
 a function pointer, there's no point in adding this indirection.

 I started working on this and I ran into a roadblock. I need to know
 what parts of the GC API are allowed to change, and discuss how to
 abstract away the implementation of it from the GC API. I assume the
 stuff in core.memory needs to stay mostly the same, though I guess we
 would need to add a setType() function that takes a pointer into a block
 of memory and a TypeInfo object and changes how the GC interprets the
 bits in the block.

 In gc.d, we define a bunch of extern(C) functions and the proxy thing.
 Since we've given up on the idea of swapping precise GCs at link time,
 can I just rip out all this unnecessary indirection? If not, is it ok to
 change some of these signatures? I definitely want to avoid allocating
 (requiring the GC lock) and then calling a function to set the type
 (requiring another lock acquisition) so the signature of malloc(), etc.
 needs to change somewhere.

 More generally, what is the intended way to get GCInfo pointers from
 TypeInfo into the guts of the GC where they can be acted on?
This is not specifically an answer to your question, but my opinion on this subject is that altering the GC API without a deprecation process or similar is fine. This is a core component of the runtime, and I don't think expecting a stable API for something like a garbage collector of all things is reasonable in the first place.

That said, I do think functionality has to be maintained. That is, if the GC can do something in a previous version, it should be able to do it in the subsequent version too. *How* that something is achieved in the subsequent version isn't so important, as long as the functionality is *there* in some way.

-- 
- Alex
Apr 18 2012
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/17/2012 5:36 PM, dsimcha wrote:
 Now that the compiler infrastructure has been implemented, I've gotten busy
 figuring out how to make D's default GC precise. As a first attempt, I think I'm
 going to adapt my original solution from
 http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and it
 works except that there previously was no clean way to get the offset info into
 the GC. As Walter pointed out in another thread, the GCInfo template is allowed
 to instantiate to data instead of a function. IMHO unless/until major
 architectural changes to the GC are made that require a function pointer,
 there's no point in adding this indirection.

 I started working on this and I ran into a roadblock. I need to know what parts
 of the GC API are allowed to change, and discuss how to abstract away the
 implementation of it from the GC API. I assume the stuff in core.memory needs to
 stay mostly the same, though I guess we would need to add a setType() function
 that takes a pointer into a block of memory and a TypeInfo object and changes
 how the GC interprets the bits in the block.

 In gc.d, we define a bunch of extern(C) functions and the proxy thing. Since
 we've given up on the idea of swapping precise GCs at link time, can I just rip
 out all this unnecessary indirection? If not, is it ok to change some of these
 signatures? I definitely want to avoid allocating (requiring the GC lock) and
 then calling a function to set the type (requiring another lock acquisition) so
 the signature of malloc(), etc. needs to change somewhere.

 More generally, what is the intended way to get GCInfo pointers from TypeInfo
 into the guts of the GC where they can be acted on?
1. I would not try to redesign everything and do precise gc at the same time.

2. The purpose of the indirection is to support DLLs so that the different DLLs can share an instance.

3. The reason for function pointers for marking is so that the marking code can be customized and directly inlined, rather than decoding a table. It costs one code indirection, but after that it cannot be beaten for speed.
Apr 18 2012
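
The trade-off in point 3 can be sketched by putting the two styles side by side. All names here (`Visit`, `MarkFn`, `markByBitmap`, `markFields`, `Pair`) are hypothetical, and the per-type marker handles only plain pointer fields; it is an illustration of the idea, not the marking code Walter has in mind.

```d
// Callback invoked for every word that may point into the GC heap.
alias Visit = void delegate(void* candidate);

// What the GC would store per type in the function-pointer design.
alias MarkFn = void function(void* obj, Visit visit);

// Table-driven marking: the bitmap is decoded in a loop at run time.
void markByBitmap(void* obj, size_t bitmap, Visit visit)
{
    auto words = cast(void**) obj;
    for (size_t i = 0; bitmap != 0; ++i, bitmap >>= 1)
    {
        if (bitmap & 1)
            visit(words[i]);
    }
}

// Function-driven marking: one instantiation per type. The loop over the
// fields is unrolled at compile time, so the generated code touches the
// pointer fields directly with no table to decode.
void markFields(T)(void* obj, Visit visit)
{
    auto p = cast(T*) obj;
    foreach (i, F; typeof(T.tupleof))
    {
        static if (is(F == U*, U))
            visit(cast(void*) (*p).tupleof[i]);
    }
}

struct Pair
{
    int tag;
    int* a;
    int* b;
}
```

A collector using the second style would store `&markFields!Pair` as a `MarkFn` and pay one indirect call per object; one using the first style would store the bitmap `0b110` for `Pair` and pay for the decoding loop instead.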
parent reply Sean Kelly <sean invisibleduck.org> writes:
On Apr 18, 2012, at 1:13 PM, Walter Bright wrote:

 On 4/17/2012 5:36 PM, dsimcha wrote:
 Now that the compiler infrastructure has been implemented, I've gotten busy
 figuring out how to make D's default GC precise. As a first attempt, I think I'm
 going to adapt my original solution from
 http://d.puremagic.com/issues/show_bug.cgi?id=3463 since it's simple and it
 works except that there previously was no clean way to get the offset info into
 the GC. As Walter pointed out in another thread, the GCInfo template is allowed
 to instantiate to data instead of a function. IMHO unless/until major
 architectural changes to the GC are made that require a function pointer,
 there's no point in adding this indirection.

 I started working on this and I ran into a roadblock. I need to know what parts
 of the GC API are allowed to change, and discuss how to abstract away the
 implementation of it from the GC API. I assume the stuff in core.memory needs to
 stay mostly the same, though I guess we would need to add a setType() function
 that takes a pointer into a block of memory and a TypeInfo object and changes
 how the GC interprets the bits in the block.

 In gc.d, we define a bunch of extern(C) functions and the proxy thing. Since
 we've given up on the idea of swapping precise GCs at link time, can I just rip
 out all this unnecessary indirection? If not, is it ok to change some of these
 signatures? I definitely want to avoid allocating (requiring the GC lock) and
 then calling a function to set the type (requiring another lock acquisition) so
 the signature of malloc(), etc. needs to change somewhere.

 More generally, what is the intended way to get GCInfo pointers from TypeInfo
 into the guts of the GC where they can be acted on?

 1. I would not try to redesign everything and do precise gc at the same time.

 2. The purpose of the indirection is to support DLLs so that the different DLLs can share an instance.

 3. The reason for function pointers for marking is so that the marking code can be customized and directly inlined, rather than decoding a table. It costs one code indirection, but after that it cannot be beaten for speed.
Leandro's GC (CDGC) is already set up to support precise scanning. It's in the Druntime git repository, but lacks the features added to the Druntime GC compared to the Tango GC on which CDGC is based. Still, it may be easier to update CDGC based on a diff between the Druntime and Tango GC than it would be to add precise scanning to the GC Druntime currently uses. Worth a look if anyone is interested anyway.
Apr 18 2012
parent dsimcha <dsimcha yahoo.com> writes:
On 4/18/2012 6:46 PM, Sean Kelly wrote:
 Leandro's GC (CDGC) is already set up to support precise scanning.  It's in
the Druntime git repository, but lacks the features added to the Druntime GC
compared to the Tango GC on which CDGC is based.  Still, it may be easier to
update CDGC based on a diff between the Druntime and Tango GC than it would be to
add precise scanning to the GC Druntime currently uses.  Worth a look if anyone
is interested anyway.
Or, failing that, I can look at it to get ideas about how to handle various annoying plumbing issues. The plumbing issues (i.e. getting the GCInfo pointers from the allocation routines into the guts of the GC) are actually the hard part of this project. Once the GC has the GCInfo pointer, making it use that for precise scanning is trivial in that I've done it before and remember roughly how I did it.
Apr 18 2012
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Apr 17, 2012, at 5:36 PM, dsimcha wrote:

 More generally, what is the intended way to get GCInfo pointers from TypeInfo into the guts of the GC where they can be acted on?
I just resurrected an old thread titled "Proposed changes to the GC interface" in the Druntime mailing list. Perhaps we could pick up discussion there?
Apr 18 2012