digitalmars.D - Garbage collector

reply FlyTox <rox271 hotmail.com> writes:
Coming from C++, I'm not a big fan of garbage collection, which I 
consider an open door to poor program design. Well, maybe I'm from 
the old school. Anyway, I'm trying to evolve and have had some thoughts about it.

As far as I understand, the gc is part of the language and will be 
nested in the compiled EXE. This basically means that the allocated 
memory will always expand till gc decides it should collect garbage (or 
when the user calls a xxxxCollect() function). This means that the D 
application will work well... to the detriment of all other tasks!


consider it as a memory server. This should enable all .NET applications 
to friendly share memory without competition. Of course we have the same 
problem as for D: non .NET applications will fight for memory with .NET 
but the idea is still attractive.

Could we have the D GC as a shared lib among all D apps?

Can we have some kind of GC tuning like "do not use more than x amount 
of memory" or at least have an idea of what the gc could free in order 
to decide when to call a xxxxxCollect function? Looks like we're out of 
control on GC, aren't we?
Apr 28 2004
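
On the "xxxxCollect()" and tuning questions above: in present-day D the collector can be driven explicitly through the core.memory module of druntime. The sketch below assumes a recent D compiler (the 2004-era library exposed a different interface), and the helper name collectNow is made up for illustration. It forces a collection, asks the runtime to hand unused memory back to the OS, and reports heap usage. It does not give a hard "use no more than x bytes" cap.

```d
import core.memory : GC;
import std.stdio : writefln;

/// Hypothetical helper: run a full collection, then return a usage snapshot.
auto collectNow()
{
    GC.collect();   // explicit full collection
    GC.minimize();  // return free physical memory to the OS where possible
    return GC.stats();
}

void main()
{
    // Produce some short-lived garbage on the GC heap.
    foreach (i; 0 .. 1_000)
    {
        auto tmp = new ubyte[](4096);
        tmp[0] = cast(ubyte) i;
    }

    auto before = GC.stats();
    auto after = collectNow();
    writefln("used: %s -> %s bytes, free: %s -> %s bytes",
             before.usedSize, after.usedSize, before.freeSize, after.freeSize);
}
```

The usedSize and freeSize figures are the closest thing to "an idea of what the gc could free"; an application could poll them and call collectNow when they cross a threshold of its own choosing.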
next sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
True, the problem existed - at least in the early days of garbage collectors.
Anyway, a reasonable GC should always consider collecting before demanding
any new memory from the operating system.

If a GC were to collect every time before demanding more memory, this would
give you the same memory efficiency as manual memory allocation. Adding
a little bit of intelligence will avoid most of the unnecessary
collections while still keeping reasonable memory efficiency.

The main problem is, of course, that it is hard to return unneeded
heap space to the OS, since the heap will usually be rather fragmented, so
any program that needs much memory just for a short moment might keep the
wasted space bloated for a long time. But that's not a problem of GC, but of
heap management in general.

Introducing shared management between programs will not work, because this
will break down all the protections of the programs against each other. If
there is just one heap, then the OS can do no memory protection. In .NET
and Java, the virtual machine can do protection on a per-object basis. In D
this is not possible, since we have pointers that can be freely modified
and used.
Apr 28 2004
parent reply Ilya Minkov <minkov cs.tum.edu> writes:
Norbert Nemec schrieb:
 True, the problem existed - at least in the old time of garbage collectors.
 Anyway, a reasonable GC should always consider to collect before demanding
 any new memory from the operating system.
Plans are to guess optimal collection time not only when the memory is low, but also
 If a GC were to collect everytime, before demanding more memory, this would
 give you the same memory efficiency as memory allocation by hand. Putting
 in a little bit of intelligence will avoid most of the unnecessary
 collections while still keeping reasonable memory efficiency.
It would stop your computer for a fraction of a second on each memory allocation! Computers have plenty of memory these days, and it's getting cheaper, so it pays off to allocate as much memory as possible before doing a collection. BTW, the time required for the synchronized step of the collection depends on the living set, not on the amount of garbage. But it's heavy. And it only amortizes over time, compared to, say, reference counting, when collecting as rarely as possible.

In fact, I have seen a collector which does not seem to follow this principle. When memory is low, it scans every time or at least way too often. And if an application actually requires more memory than the computer can give, it starts getting very, very slow: otherwise the OS would simply swap out the inactive (but living) part of the set, but the collector keeps scanning through all of the application's data and requires the OS to swap parts in and out! Garbage collection should be limited so that it does not happen too often and does not rescan "clean"/swapped-out pages - that's why GC should be an OS service.
 Introducing a shared management between programs will not work, because this
 will break down all the protections of the programs against each other. If
 there is just one heap, then the OS can do no memory protection. In .NET
 and Java, the virtual machine can do protection on a per-object base. In D
 this is not possible, since we have pointers that can be freely modified
 and used.
No. Each program runs its own instance of a garbage collector - be it a DLL or linked into the code. To make them "talk" to each other, some sort of inter-process communication is required anyway - which can be made safe since it doesn't put multiple applications into a common address space. However, I don't see what sort of information they should exchange. The number and memory aggressiveness of other garbage-collected programs? Since not all programs are garbage collected, and many programs use different collectors, but all have some sort of memory usage strategy, it is better to "guess" such things than to ask.

Besides, to keep collection pauses unnoticeable, it makes sense to keep multiple provably independent heaps, which can then be scanned asynchronously. Making garbage collection an OS kernel service would boost it radically though, because dirty bits can be used. -eye
Apr 29 2004
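
The amortization argument above can be put into numbers with a toy model. This is only a sketch under a stated assumption, not a measurement of any real collector: suppose one collection costs work proportional to the live set and frees everything else in the heap. The function name costPerAllocatedUnit and the figures are invented for illustration.

```d
import std.stdio : writefln;

/// Toy model (assumption): one collection costs `live` units of work
/// (tracing the live set) and frees `heap - live` units of garbage,
/// so the collector's work per unit of memory handed out is their ratio.
double costPerAllocatedUnit(double live, double heap)
{
    assert(heap > live, "the heap must have room beyond the live set");
    return live / (heap - live);
}

void main()
{
    immutable live = 10.0;
    // Same live set, increasingly generous heaps.
    foreach (heap; [20.0, 50.0, 90.0])
        writefln("heap %5.1f -> cost per allocated unit %.3f",
                 heap, costPerAllocatedUnit(live, heap));
}
```

With a live set of 10 units the ratio falls from 1.0 (heap of 20) to 0.25 (heap of 50) to 0.125 (heap of 90): under this model, collecting rarely in a large heap is what makes tracing cheap, and collecting on every allocation is the worst case.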
parent J Anderson <REMOVEanderson badmama.com.au> writes:
Of course, each time you break the GC up into small steps it becomes less efficient. In some cases you want to get rid of the pause, so this is the best solution. In other cases you want the program to complete in the shortest total time. -- -Anderson: http://badmama.com.au/~anderson/
Apr 29 2004
prev sibling next sibling parent Andy Friesen <andy ikagames.com> writes:
FlyTox wrote:
 Coming from C++, I'm not a big fan of Garbage Collection which I 
 consider as an open door to poor program design. Well, may be I'm from 
 the old school. Anyway, I try to evolve and had some thoughts about it.
 
 As far as I understand, the gc is part of the language and will be 
 nested in the compiled EXE. This basically means that the allocated 
 memory will always expand till gc decides it should collect garbage (or 
 when the user calls a xxxxCollect() function). This means that the D 
 application will work well... to the detriment of all other tasks!
This is a good point; the virtual memory manager would almost certainly mitigate this by swapping garbage to disk, though, so I don't know how much of an impact it has.

 consider it as a memory server. This should enable all .NET applications 
 to friendly share memory without competition. Of course we have the same 
 problem as for D: non .NET applications will fight for memory with .NET 
 but the idea is still attractive.
 
 Could we have the D GC as a shared lib among all D apps?
There is another huge reason why this is a good idea: If the D garbage collector is shared among all running D applications, then those applications can toss memory to each other without any concern for which created it. (ordinarily, the DLL that allocated a hunk of memory must deallocate it)
 Can we have some kind of GC tuning like "do not use more than x amount 
 of memory" or at least have an idea of what the gc could free in order 
 to decide when to call a xxxxxCollect function? Looks like we're out of 
 control on GC, aren't we?
D does have a delete operator and auto classes, which tell the language that you're really sure you're done with something. (of course, both have the same problem that C++ has, but that's okay; they're opt-in constructs, instead of opt-out) -- andy
Apr 28 2004
prev sibling next sibling parent reply J Anderson <REMOVEanderson badmama.com.au> writes:
I think it would be nice to have a suite of GCs to choose from for the particular application. For example, in a computer game, I want as many resources as possible (without causing system problems of course). Many types of GC should come to D in time, as you can write your own GC in D. -- -Anderson: http://badmama.com.au/~anderson/
Apr 28 2004
parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
J Anderson wrote:

 I think it would be nice to have a suite of GCs to choose from for the particular application. For example, in a computer game, I want as many resources as possible (without causing system problems of course). Many types of GC should come to D in time, as you can write your own GC in D.
I actually prefer one good GC with options. I believe Boehm GC has all the options you want. And if it doesn't, it would probably still be far easier to improve that one than to write a new one from scratch.
Apr 28 2004
parent reply J Anderson <REMOVEanderson badmama.com.au> writes:
Norbert Nemec wrote:

I actually prefer one good GC with options. I believe Boehm GC has all the options you want. And if it doesn't, it would probably still be far easier to improve that one than to write a new one from scratch.
Humm. It would be nice to have Boehm ported to D (but of course using D's nicer GC syntax). The advantage of having separate GCs is that you're not putting all your eggs in one basket. Different factions can develop/compete with their own flavours of GC. GCs could even be developed specially to take advantage of particular systems. -- -Anderson: http://badmama.com.au/~anderson/
Apr 28 2004
next sibling parent reply Bastiaan Veelo <Bastiaan.N.Veelo ntnu.no> writes:
J Anderson wrote:
 Humm.  It would be nice to have Boehm ported to D (but of course using 
 D's nicer GC syntax).
 
Without having the slightest clue of what I am talking about, at least in the latest release of gdc there seems to be a directory called boehm-gc in phobos, apparently imported from the Java support of gcc. Seems there is no need for porting. Bastiaan.
Apr 28 2004
parent reply Stephan Wienczny <wienczny web.de> writes:
Bastiaan Veelo wrote:
 Without having the slightest clue of what I am talking about, at least 
 in the latest release of gdc there seems to be a directory called 
 boehm-gc in phobos, apparently imported from the Java support of gcc.
 
 Seems there is no need for porting.
 
 Bastiaan.
 
He wants to have the garbage collector written in D. You are already able to use the C interface to boehm-gc but a D version would be better. Stephan
Apr 28 2004
parent reply resistor AT mac DOT com <resistor_member pathlink.com> writes:
No, he means that boehm-gc is actually part of the compiler in GDC.  I don't
know exactly how he's using it, but he lifted it out of the Java GC system
that GCJ uses.

Owen

Apr 28 2004
next sibling parent Stephan Wienczny <wienczny web.de> writes:
He uses some binding sources (d_os_dep.c and d_init.c).
This could be a good idea, but then you should rewrite it in D. It looks 
better. By the way, I did not know boehm-gc was incremental...

Apr 28 2004
prev sibling parent David Friedman <d3rdclsmail earthlink.net> writes:
resistor AT mac DOT com wrote:
 No, he means that boehm-gc is actually part of the compiler in GDC.  I don't
 know exactly how he's using it, but he lifted it out of the Java GC system
 that GCJ uses.
 
 Owen
 
I am only borrowing a few pieces of boehm-gc to handle the platform-dependent tasks of finding the data segment and stack extents. The actual memory management code is not used. I should probably rewrite the relevant parts in D or just make a stripped-down C version, but I need to study the code some more. David
Apr 28 2004
prev sibling parent reply Norbert Nemec <Norbert.Nemec gmx.de> writes:
J Anderson wrote:
 Humm.  It would be nice to have Boehm ported to D (but of course using
 D's nicer GC syntax).
That would be a mostly ideological difference. The Boehm GC has a clean interface. Why should anyone care whether it is C or D internally? Of course, one might be able to write a specialized GC that is tailored to the needs of D, and that person would probably just use D for writing it. But just rewriting it to have it in D is an extremely low priority task...
Apr 29 2004
parent "Walter" <newshound digitalmars.com> writes:
"Norbert Nemec" <Norbert.Nemec gmx.de> wrote in message
news:c6qosf$s8c$2 digitaldaemon.com...
 J Anderson wrote:
 Humm.  It would be nice to have Boehm ported to D (but of course using
 D's nicer GC syntax).
 That would be a mostly ideological difference. The Boehm GC has a clean
 interface. Why should anyone care whether it is C or D internally? Of
 course, one might be able to write a specialized GC that is tailored to the
 needs of D, and that person would probably just use D for writing it. But
 just rewriting it to have it in D is an extremely low priority task...
I agree. D has the ability to link up directly to C code for the purpose of avoiding the necessity to translate working C code to D. The Boehm GC is complicated enough that doing the conversion will:

1) likely introduce bugs into working, debugged code
2) make it real hard to merge improved C versions of the collector into the D version

But there are exceptions to this - the D gc implementation was originally written in C++. I translated it into D in order to prove that system level code could be written in D. I translated Empire from C++ into D in order to help expand the user base of D by providing tweakable source to a popular game. (Interestingly, I've written versions of Empire in Basic, Fortran, PDP-11 assembler, C, C++ and now D. The Basic version was the only one that didn't work.)
May 09 2004
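
To illustrate the point about linking directly to C code, here is a minimal sketch that calls the C standard library's strlen from D through nothing more than a declaration. It assumes a current D compiler and a C runtime on the link line (the default on common platforms); the wrapper name cStringLength is made up for the example. The same mechanism is how a C collector such as Boehm's could be called without translating it.

```d
import std.stdio : writeln;
import std.string : toStringz;

// Declaration of a function that lives in the C runtime. No wrapper
// library or translated source is needed; the linker resolves the symbol.
extern (C) size_t strlen(const(char)* s);

/// Hypothetical convenience wrapper: length of a D string as C sees it.
size_t cStringLength(string s)
{
    // toStringz yields a zero-terminated copy suitable for C APIs.
    return strlen(s.toStringz);
}

void main()
{
    writeln(cStringLength("garbage collector"));
}
```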
prev sibling next sibling parent reply FlyTox <rox271 hotmail.com> writes:
Another aspect of the GC is that we do not control when an object's dtor 
executes, do we?
It is a common habit in C++ to drop a temporary object into a particular 
function just to be sure the ctor and dtor are executed at the 
beginning and at the end of the function scope.
A good and simple example is the MFC CWaitCursor class. This class 
selects the wait cursor in the ctor and restores the initial cursor in 
the dtor.

C++ example:

void myTimeConsumingFunction()
{
    CWaitCursor wc;
    /*
    ....
    */
}

As far as I understand the D equivalent would be:

void myTimeConsumingFunction()
{
    CWaitCursor wc;
    wc = new CWaitCursor;
    /*
    ....
    */

    /* just to overcome the GC dtor execution time ??? */
    delete wc;
    /* or wc=null; ?? */
}

The D version doesn't seem very good. Is this the right thing to do?
I'm not sure I quite understand this aspect.

Apr 28 2004
next sibling parent "Unknown W. Brackets" <unknown at.simplemachines.dot.org> writes:
FlyTox wrote:
 Another aspect of the GC is we do not master objects dtor execution 
 time, do we?
 [ ... ]
 The D version doesn't seem very good. Is this the right thing to do?
 I'm not sure I quite understand this aspect.
 
I would suggest that this is misusing the destructor. It should, again I only suggest, be a method you call to restore the cursor... While being able to know it will just "flip back" later is okay, it's cleaner and more understandable if there's a method. If I were reading code that utilized a class that did this, it would take me longer to realize that this is what's happening without the method call.

D wasn't made, to my understanding, to make programming as easy as buttering toast. For that, you can use Visual Basic if you really must. No, D seems to have the right goal in mind - making clean, readable, logical, and predictable code that is better for those in open source who actually work to make their code readable and understandable by others - people like me. But, this is just my opinion. -[Unknown]
Apr 28 2004
prev sibling parent reply Andy Friesen <andy ikagames.com> writes:
FlyTox wrote:
 Another aspect of the GC is we do not master objects dtor execution 
 time, do we?
 [ ... ]
 The D version doesn't seem very good. Is this the right thing to do?
 I'm not sure I quite understand this aspect.
auto references achieve exactly what you're after.

void myTimeConsumingFunction()
{
    auto WaitCursor wc = new WaitCursor();
    ...
}

wc is implicitly deleted when the function scope ends. It's more or less directly equivalent to this:

void myTimeConsumingFunction()
{
    WaitCursor wc = new WaitCursor();
    try
    {
        ...
    }
    finally
    {
        delete wc;
    }
}

 -- andy
Apr 29 2004
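
For readers on a current D toolchain: as far as I know, auto now means type inference and the delete operator is gone, so the 2004 syntax above will not compile today. The same deterministic cleanup can be sketched with a struct destructor or a scope(exit) guard. The WaitCursor type, the events log and the show() factory below are stand-ins invented for the example; a real version would call the GUI toolkit instead of appending strings.

```d
import std.stdio : writeln;

string[] events; // stand-in for the real cursor API: records what happened

struct WaitCursor
{
    private bool active;

    /// Hypothetical factory: "selects" the wait cursor.
    static WaitCursor show()
    {
        events ~= "wait cursor on";
        return WaitCursor(true);
    }

    @disable this(this); // non-copyable, so cleanup runs exactly once

    ~this()
    {
        if (active)
            events ~= "wait cursor off"; // "restores" the cursor
    }
}

void myTimeConsumingFunction()
{
    auto wc = WaitCursor.show();
    events ~= "working";
} // wc's destructor runs here, at scope exit, independent of the GC

void sameThingWithScopeGuard()
{
    events ~= "wait cursor on";
    scope (exit) events ~= "wait cursor off"; // runs on return or exception
    events ~= "working";
}

void main()
{
    myTimeConsumingFunction();
    sameThingWithScopeGuard();
    writeln(events);
}
```

Both functions record the sequence on, working, off. The struct version mirrors the C++ CWaitCursor idiom, while scope(exit) keeps the cleanup visible at the call site, which answers the readability objection raised elsewhere in the thread.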
parent FlyTox <rox271 hotmail.com> writes:
Very good. It works!
I guess, I will need to spend more time on the doc :-)

Thanks Andy.

Apr 29 2004
prev sibling next sibling parent renox <renosky free.fr> writes:
FlyTox wrote:
 Coming from C++, I'm not a big fan of Garbage Collection which I 
 consider as an open door to poor program design.
Some C++ programs such as Mozilla/FF suck at memory handling too; GC is not a magic wand, it's just easier, and you can still do manual allocation if you want. [cut]
 As far as I understand, the gc is part of the language and will be 
 nested in the compiled EXE. This basically means that the allocated 
 memory will always expand till gc decides it should collect garbage
 (or when the user calls a xxxxCollect() function). This means that
 the D application will work well... to the detriment of all other
 tasks!
Depends on memory usage of course. But this also means that in some cases the GC can have higher performance than malloc/free, as it frees several objects at the same time (the price to pay being a bigger pause during the collection, which may or may not be a problem for the application; of course some GCs avoid the pause but have lower performance). [cut]
 Could we have the D GC as a shared lib among all D apps?
No, because of memory protection, as others have pointed out.
 Can we have some kind of GC tuning like "do not use more than x
 amount of memory" or at least have an idea of what the gc could free
 in order to decide when to call a xxxxxCollect function? Looks like
 we're out of control on GC, aren't we?
Some GCs for Java have such knobs, but it's quite annoying to have to hand-tune the GC memory usage. There has been some research (see the Page-Level Cooperative Garbage Collection paper) on having the GC cooperate with the virtual memory manager of the OS, to keep the GC from overflowing the available memory, which would create swapping problems. The big downside of course is that the kernel must be modified to support this, but for Linux or *BSD this is not so far-fetched, for example if Parrot or Harmony implement this kind of GC in the future. Regards, RenoX
Jun 02 2006
prev sibling parent "AJ" <aj nospam.net> writes:
FlyTox wrote:

 Coming from C++, I'm not a big fan of Garbage Collection which I
 consider as an open door to poor program design.
Until the hardware is not your closest friend, GC is just a pot of boiling oil. To include it as a fundamental in a language, IMO, is to create a stillborn language. (I am sooooo bad!!!)
Nov 21 2009