www.digitalmars.com         C & C++   DMDScript  

D - Garbage Collection

Warren Baird <warren 127.0.0.1> writes:
I apologize if this has been addressed somewhere; I haven't read
everything on this news server yet...

In general I like the sound of D a lot - One of the things I liked about 
Java was getting rid of a lot of the annoying and troublesome features 
of C++.

However, I come from a background of doing scientific visualization 
software for computational fluid dynamics, and one of the things that 
made Java totally useless was the garbage collection model it used. When 
you are tossing about 100Mb arrays of data - you really need a way to 
say "I'm done with this - deallocate it NOW".

Java's GC tends (or tended, I haven't looked at it in a year or so) to 
only run when things are idle, so it was really easy to run out of 
memory if you were doing a lot of intensive work.

I'm not saying that GC is a bad thing, just that it would be great if
there was a way to convince the GC to run at specified intervals, to
explicitly indicate that it should run NOW, or to explicitly deallocate
a specified block of memory...

Are there any plans to provide this kind of functionality?

Warren
Aug 16 2001
"Angus Graham" <agraham_d agraham.ca> writes:
"Warren Baird" <warren 127.0.0.1> wrote in message
 <snip> When
 you are tossing about 100Mb arrays of data - you really need a way to
 say "I'm done with this - deallocate it NOW".

From http://www.digitalmars.com/d/class.html:

"The program can explicitly inform the garbage collector that an object is
no longer referred to (with the delete expression), and then the garbage
collector calls the destructor immediately, and adds the object's memory to
the free storage."
Aug 16 2001
"Walter" <walter digitalmars.com> writes:
Yes, you are absolutely right. D has a "delete" operator, that explicitly
tells the gc that this memory can be free'd now and there's no need to wait
until the next gc cycle. It'd be up to the programmer, however, to guarantee
that there'd be no dangling references to the memory, so it isn't perfect
:-(
 -Walter
Aug 16 2001
Christophe de Dinechin <descubes earthlink.net> writes:
Walter wrote:

 Yes, you are absolutely right. D has a "delete" operator, that explicitly
 tells the gc that this memory can be free'd now and there's no need to wait
 until the next gc cycle. It'd be up to the programmer, however, to guarantee
 that there'd be no dangling references to the memory, so it isn't perfect
 :-(
  -Walter

This disrupts the whole safety given by the GC.

Let's assume for a while that your GC has some way of telling: this is the
last reference to the object (easy in reference-counting schemes, more
difficult otherwise). Then doing delete when you are not the last reference
could be flagged as a programming error.

Alternatively, delete could simply be a means to remove a reference to the
object. If there are no other references, then you are guaranteed that the
object is freed immediately. If there are other references, then the object
remains in memory, but your own reference is removed.

Both approaches are safer, which is the whole point of a GC. Again, a GC
without safety is close to worthless. It's a bit like saying: I have a GC,
but it crashes if I have a null pointer in my program.

Christophe
Aug 17 2001
"Will Hartung" <willh msoft.com> writes:
"Christophe de Dinechin" <descubes earthlink.net> wrote in message
news:3B7D3073.950BD33 earthlink.net...
 Walter wrote:

 Yes, you are absolutely right. D has a "delete" operator, that explicitly
 tells the gc that this memory can be free'd now and there's no need to wait
 until the next gc cycle. It'd be up to the programmer, however, to guarantee
 that there'd be no dangling references to the memory, so it isn't perfect
 :-(
  -Walter

 This disrupts the whole safety given by the GC.
 *snip*

Yes, this is like "Look, we have the potential slowdowns of GC plus the
instability of dangling memory references! Whee!" Sounds like the worst of
both worlds to me.

Now, in theory, as Christophe mentioned, the delete can be a "smart" delete,
but assigning an object reference to NULL could be "just as smart".

But I think adding this capability gives false hope to the users. "How come
my code is slow, I don't use GC at all, just 'delete'?" Why, it's slow for
the same reason it would be slow in other non-GC'd languages. Memory
management is a lot more than throwing "new" and "delete" around willy-nilly.
Garbage collectors add the ability to recompact and reorganize memory as
well.

So, I think that you may want to allow the program to be more conversational
with the GC, so that you might be able to do things like what alloca (the
GNU stack allocator) does (i.e. I'm about to allocate a bunch of real
short-term stuff that you can kill en masse when I'm done...).

Another complaint most people have is that they've grown accustomed to using
C++ destructors, which in many GC'd languages may never fire. In Java that
tends to be done in 'finally' blocks.

/Will
Aug 17 2001
Russell Bornschlegel <kaleja estarcion.com> writes:
So, how about:

  delete foo;  // decrement reference count and possibly delete

  delete;      // or "compact;" -- force GC to run

...and it's up to the programmer to make sure there's only one 
reference if he wants the delete to really work. That way 
things are safe, but the programmer can still force a release 
or a GC pass.

-Russell B
Aug 17 2001
Russ Lewis <russ deming-os.org> writes:
Walter wrote:

 Yes, you are absolutely right. D has a "delete" operator, that explicitly
 tells the gc that this memory can be free'd now and there's no need to wait
 until the next gc cycle. It'd be up to the programmer, however, to guarantee
 that there'd be no dangling references to the memory, so it isn't perfect
 :-(

How about this as a safer syntax:

  gc <statement>

EXAMPLE:

  int[] buf = new int[10000];
  ...
  gc buf = foo();

Any statement prefixed with gc will cause immediate garbage collection on any
references lost as part of that statement. That is, the reference to the
10,000-member int array is lost in the last statement, so it is immediately
garbage-collected when the statement completes (or, perhaps, immediately when
the last reference is lost). My general inclination is to say that this
statement shouldn't affect the performance of foo(), though that's not
absolute.

Another idea would be to use delete, but have it only valid on lvalues. It
would set the value of that variable to null, then check to see if any
references remain. If so, no garbage collection happens. If not, garbage
collection happens immediately.

  int[123456] buf;
  int *ptr = buf;
  delete buf; /* buf is set to null here, but the buffer is NOT cleaned up */
  delete ptr; /* ptr is set to null, and the buffer is immediately cleaned up */

Of course, I'm not 100% sure that any such syntax is needed at all. Perhaps
it's enough to just be able to force a garbage collection. Just force one
whenever you delete something that you think might be unnaturally large.

Thoughts?
Sep 06 2001
"Walter" <walter digitalmars.com> writes:
It would not be too practical to scan all of memory for references to just
one object - you might as well do a full gc.

The purpose of a specific delete is:

1) Get the destructor for the object run right now
2) Aid the GC

Sep 06 2001
Russ Lewis <russ deming-os.org> writes:
Walter wrote:

 It would not be too practical to scan all of memory for references to just
 one object - you might as well do a full gc.

 The purpose of a specific delete is:

 1) Get the destructor for the object run right now
 2) Aid the GC

In a subjective sense, how expensive is the GC routine, and how much backlog
is likely to happen? I'm not convinced that it's a bad thing to just force a
complete run of the GC when you need to guarantee cleanup right now.
Sep 07 2001
Axel Kittenberger <axel dtone.org> writes:
At least one widespread implementation, the Boehm-Demers-Weiser GC
( http://www.hpl.hp.com/personal/Hans_Boehm/gc/ ),
also allows you to explicitly free an object when you want to, with
GC_free().

It also allows you to register "finalizers" with objects, which are run when
the objects are destroyed.

- Axel
Sep 07 2001
"Walter" <walter digitalmars.com> writes:
Russ Lewis wrote in message <3B9878CF.F6931E34 deming-os.org>...
Walter wrote:
 It would not be too practical to scan all of memory for references to just
 one object - you might as well do a full gc.
 The purpose of a specific delete is:
 1) Get the destructor for the object run right now
 2) Aid the GC
In a subjective sense, how expensive is the GC routine, and how much backlog
is likely to happen?  I'm not convinced that it's a bad thing to just force a
complete run of the GC when you need to guarantee cleanup right now.

It's a regular GC - nothing special. The first version will be adequate, but
not great. Later versions will be generational, with much less overhead. D is
friendly to a GC, so it should work much better than one for C++.
Sep 07 2001
Russ Lewis <russ deming-os.org> writes:
Walter wrote:

 It would not be too practical to scan all of memory for references to just
 one object - you might as well do a full gc.

 The purpose of a specific delete is:

 1) Get the destructor for the object run right now
 2) Aid the GC

I have no experience with (implementations of) GC, so I don't know how it
works. Maybe we could have a few details? My assumptions were based on the
thought that the algorithm would work something like this:

* Maintain a reference count on each GC-able object.
* When assigning a pointer (or array), first take the old value and put it in
  a list of objects to consider for the GC. You normally would only do this
  when the reference count goes to 0, but you have to consider circular
  references that are not accessible from the main tree.
* When the GC runs, it just iterates down the list of recently released
  objects; when it finds one with 0 references (or with only circular
  references), it automatically calls the destructor.

However, you talk about "scanning", which sounds like a different algorithm
altogether, so I'm stumped... I'm hoping you don't mean that you're scanning
through all of memory for pointers to an object (eek!)
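
To make the scheme guessed at above concrete, here is a minimal sketch in present-day D. It is only a toy model, not how D's collector works: the names Obj, RefCounts, acquire, release and run are hypothetical, counts are adjusted by hand, and "calling the destructor" is simulated with a flag. The second half of the unittest shows the circular-reference caveat: counts alone never reach zero for a cycle, which is one common reason collectors trace ("scan") from the program's roots instead.

```d
/// Toy object: a reference count plus a flag standing in for the destructor.
class Obj
{
    int refs;          // number of live references to this object
    bool destroyed;    // set when the toy collector "runs the destructor"
}

/// Toy collector following the three steps in the post (all names hypothetical).
struct RefCounts
{
    Obj[] candidates;  // objects that recently lost a reference

    /// Record a new reference to `o`.
    void acquire(Obj o) { ++o.refs; }

    /// Drop one reference and queue the object for the next run.
    void release(Obj o)
    {
        --o.refs;
        foreach (c; candidates)
            if (c is o) return;      // already queued
        candidates ~= o;
    }

    /// Destroy every queued object whose count reached zero.
    /// Returns how many objects were destroyed.
    size_t run()
    {
        size_t freed;
        foreach (o; candidates)
        {
            if (o.refs == 0 && !o.destroyed)
            {
                o.destroyed = true;
                ++freed;
            }
        }
        candidates = null;
        return freed;
    }
}

unittest
{
    RefCounts rc;
    auto a = new Obj;
    auto b = new Obj;
    rc.acquire(a); rc.acquire(a);    // two references to a
    rc.acquire(b);                   // one reference to b
    rc.release(a);                   // a still has one reference
    rc.release(b);                   // b has none
    assert(rc.run() == 1);
    assert(b.destroyed && !a.destroyed);

    // A cycle: x and y refer to each other, and the program holds x.
    auto x = new Obj;
    auto y = new Obj;
    rc.acquire(x);                   // program -> x
    rc.acquire(y);                   // x -> y
    rc.acquire(x);                   // y -> x
    rc.release(x);                   // program drops x: the pair is unreachable
    assert(rc.run() == 0);           // but neither count is zero
}
```

Compiled with `dmd -unittest -main`, the asserts exercise both cases. Handling the cycle would need an extra pass over the queued objects, which this sketch deliberately leaves out.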
Sep 07 2001
"Walter" <walter digitalmars.com> writes:
Hans Boehm has some great papers on how GC works. People write books about
it - more than possible in a simple posting! You can find out a lot by
Google'ing on "garbage collection". -Walter

Sep 07 2001
"Michael Gaskins" <mbgaski clemson.edu> writes:
Actually I think it would be very beneficial if we could just have a
function call (forgive me if it's called something different in D, I've just
started looking into the project) to force the garbage collector to run.
This way the programmer could periodically "clean out the memory" at
opportune times in the program's execution cycle.  The automatic GC routines
should still be implemented, but it would be nice to also be able to
initiate the process manually.

Michael Gaskins
Computer Science Dept, Clemson University
Undergraduate (Junior)
Aug 16 2001
"Robert W. Cunningham" <rwc_2001 yahoo.com> writes:
Michael Gaskins wrote:

 Actually I think it would be very beneficial if we could just have a
 function call (forgive me if it's called something different in D, I've just
 started looking into the project) to force the garbage collector to run.
 This way the programmer could periodically "clean out the memory" at
 opportune times in the program's execution cycle.  The automatic GC routines
 should still be implemented, but it would be nice to also be able to
 initiate the process manually.

How about using "delete" alone, without any following parameter, for this
purpose?

-BobC
Aug 16 2001
"Sheldon Simms" <sheldon semanticedge.com> writes:
In article <9li8re$npb$1 digitaldaemon.com>, "Michael Gaskins"
<mbgaski clemson.edu> wrote:

 Actually I think it would be very beneficial if we could just have a
 function call (forgive me if it's called something different in D, I've
 just started looking into the project) to force the garbage collector to
 run. This way the programmer could periodically "clean out the memory"
 at opportune times in the program's execution cycle.  The automatic GC
 routines should still be implemented, but it would be nice to also be
 able to initiate the process manually.

I find this to be a much better solution than allowing the programmer to
throw away individual memory blocks regardless of who else might be
referencing them.

--
Sheldon Simms / sheldon semanticedge.com
Aug 17 2001
"Kent Sandvik" <sandvik excitehome.net> writes:
"Michael Gaskins" <mbgaski clemson.edu> wrote in message
news:9li8re$npb$1 digitaldaemon.com...
 Actually I think it would be very beneficial if we could just have a
 function call (forgive me if it's called something different in D, I've just
 started looking into the project) to force the garbage collector to run.
 This way the programmer could periodically "clean out the memory" at
 opportune times in the program's execution cycle.  The automatic GC routines
 should still be implemented, but it would be nice to also be able to
 initiate the process manually.

For real-time intensive tasks, if one could also enforce that the GC should
not run for a certain section, maybe that would also help out. It's true it
would cause performance problems later, but it gives more power in case
there's a section in the code where GC would cause problems, especially with
timing issues and such.

--Kent
Aug 17 2001
Roland <rv ronetech.com> writes:
Unfortunately, the specification says:

Who D is Not For
     Real time programming where latency must be guaranteed.
...

For me that is not good news, as D seems nice.

Roland
Aug 25 2001
Dan Hursh <hursh infonet.isl.net> writes:
Roland wrote:

 unfortunately specifications says:

 Who D is Not For
      Real time programming where latency must be guaranteed.
 ...
 for me it is not a good news as D seems nice
 Roland

Along these lines, is the garbage collector the only reason D cannot be used
for real-time? It seems to me that, depending on how much control the
programmer can get over how the GC behaves (I thought Walter mentioned a
module to tune it), D could still be reasonable for multimedia and other
non-critical apps that often use real-time scheduling, right? Given the
ability to use delete (carefully!) D might not be so bad. There are still
temp values from expressions that need cleaning.

Also, am I understanding GC right in believing that, for the most part, you
can stop a GC sweep part way through and continue it later? Does it have to
be atomic?

Dan
Aug 25 2001
Florian Weimer <Florian.Weimer RUS.Uni-Stuttgart.DE> writes:
Dan Hursh <hursh infonet.isl.net> writes:

 Also, am I understanding GC right in believing that, for the most part, you
 can stop a GC sweep part way through and continue it later?  Does it
 have to be atomic?

No, it doesn't. If you're using a mark-and-sweep collector, the sweep phase
is rather uncritical anyway; sweeping can be interleaved with normal
execution. For the mark phase, things are more complicated, of course, but
there are certainly more options than just halting the mutator. AFAIK, recent
versions of the Boehm collector (a probabilistic collector for C) support
concurrent marking on some architectures.

--
Florian Weimer  Florian.Weimer RUS.Uni-Stuttgart.DE
University of Stuttgart  http://cert.uni-stuttgart.de/
RUS-CERT  +49-711-685-5973/fax +49-711-685-5898
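
The point about interleaving the sweep can be illustrated with a small toy model in present-day D. This is a sketch under stated assumptions, not the Boehm collector or D's runtime: Cell, MarkSweep, mark and sweepStep are hypothetical names, the mark phase here simply runs to completion, and "freeing" just sets a flag. Cells allocated while a sweep is in progress are created already marked so that the unfinished sweep does not reclaim them.

```d
/// Toy heap cell: outgoing references plus mark and freed bits.
class Cell
{
    Cell[] refs;     // cells this cell points to
    bool marked;     // set during the mark phase if reachable
    bool freed;      // set by the sweep phase when reclaimed
}

struct MarkSweep
{
    Cell[] heap;      // every allocated cell
    size_t sweepPos;  // where the next sweep step resumes
    bool sweeping;    // true between mark() and the end of the sweep

    Cell alloc()
    {
        auto c = new Cell;
        c.marked = sweeping;   // protect cells born during a sweep
        heap ~= c;
        return c;
    }

    /// Mark phase: flag everything reachable from the roots.
    void mark(Cell[] roots)
    {
        Cell[] work = roots.dup;
        while (work.length)
        {
            auto c = work[$ - 1];
            work = work[0 .. $ - 1];
            if (c.marked) continue;
            c.marked = true;
            work ~= c.refs;
        }
        sweepPos = 0;
        sweeping = true;
    }

    /// Sweep at most `budget` cells, then return so the program can continue.
    /// Returns true once the whole heap has been swept.
    bool sweepStep(size_t budget)
    {
        for (size_t n = 0; n < budget && sweepPos < heap.length; ++n)
        {
            auto c = heap[sweepPos++];
            if (c.marked) c.marked = false;   // survivor: reset for next cycle
            else c.freed = true;              // unreachable: reclaim
        }
        immutable done = sweepPos == heap.length;
        if (done) sweeping = false;
        return done;
    }
}

unittest
{
    MarkSweep ms;
    auto root = ms.alloc();
    auto kept = ms.alloc();
    auto junk = ms.alloc();
    root.refs ~= kept;

    ms.mark([root]);
    assert(!ms.sweepStep(2));        // only part of the heap swept so far
    auto fresh = ms.alloc();         // program keeps running between steps
    assert(ms.sweepStep(2));         // finish the sweep later
    assert(junk.freed && !kept.freed && !root.freed && !fresh.freed);
}
```

Splitting the sweep is safe here because a cell that was unreachable at mark time cannot become reachable again. Splitting the mark phase is the harder part, as the post says, and is not attempted in this sketch.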
Aug 25 2001
"Sean L. Palmer" <spalmer iname.com> writes:
I wonder if it would be possible to have a compiler switch choose between GC
methods... scanning or refcounting.  In any program, one may be more
efficient than the other... probably open up a whole can of worms when
linking to separately-compiled libs though.  I'm not even sure the compiler
could do the refcounting method properly since it'd have to detect taking
the address of a member of a class and inc the class refcount... <shiver>

But the thing about GC that scares the bejeezus out of game programmers is
that GC might happen in the middle of your game, causing a 3-second delay in
the gameplay.  At least we should be able to make sure the GC does *not* run
by, say, not performing any allocations.

Sean
Oct 23 2001
"Walter" <walter digitalmars.com> writes:
You can temporarily disable collection.
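
For readers on a current D compiler, pausing collection can be sketched as follows. This assumes the core.memory module of today's D runtime, which postdates this thread; the helper name withCollectionsPaused is hypothetical. Per the runtime documentation, GC.disable() only suppresses automatic collections, and the implementation may still collect when it has to, for example when memory runs out.

```d
import core.memory : GC;

/// Run `work` with automatic collections switched off, then switch them
/// back on. `withCollectionsPaused` is a hypothetical helper name.
void withCollectionsPaused(scope void delegate() work)
{
    GC.disable();
    scope (exit) GC.enable();   // re-enable even if work() throws
    work();
}

unittest
{
    int frames;
    // e.g. one frame of a game loop that should not be interrupted
    withCollectionsPaused({ ++frames; });
    assert(frames == 1);
}
```

Calls to GC.disable() nest, so each one needs a matching GC.enable(); the scope (exit) guard keeps the pair balanced.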
Oct 28 2001
Russ Lewis <russ deming-os.org> writes:
Warren Baird wrote:

 I apologize if this has been addressed somewhere I haven't read
 everything on this news server yet...

 In general I like the sound of D a lot - One of the things I liked about
 Java was getting rid of a lot of the annoying and troublesome features
 of C++.

 However, I come from a background of doing scientific visualization
 software for computational fluid dynamics, and one of the things that
 made Java totally useless was the garbage collection model it used. When
 you are tossing about 100Mb arrays of data - you really need a way to
 say "I'm done with this - deallocate it NOW".

 Java's GC tends (or tended, I haven't looked at it in a year or so) to
 only run when things are idle, so it was really easy to run out of
 memory if you were doing a lot of intensive work.

 I'm not saying that GC is a bad thing, just that it would be great if
 there was a way of convince the GC to run at specified intervals, to
 explicitly indicate that it should run NOW, or to explicitly deallocate
 a specified block of memory...

I agree that it would be good to have some way to tell it "do garbage
collection now."

The simplest way to deal with things would be to run the garbage collection
routine any time that the operator new fails. Inside that operator, if it
fails once, it should run the garbage collector and then retry. If it still
fails, then throw an out of memory exception.

I can see where the programmer might want to schedule garbage collection
manually, but I think that this should be as automated a process as possible.

Another idea: what about something (perhaps defined at compile time, perhaps
a runtime parameter) that would automatically call the garbage collector
after a certain number of releases of references?
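
The collect-and-retry idea above can be sketched with a toy heap in present-day D. Everything here is hypothetical (ToyHeap, ToyOutOfMemory and the byte counters); it only models the control flow of "fail once, collect, retry, then throw" and does not manage real memory.

```d
/// Thrown when memory is still exhausted after a collection (hypothetical).
class ToyOutOfMemory : Exception
{
    this() { super("out of memory"); }
}

/// Toy fixed-size heap that only counts bytes. `garbage` is memory that is
/// no longer referenced but has not been reclaimed yet.
struct ToyHeap
{
    size_t capacity;
    size_t live;         // bytes still referenced
    size_t garbage;      // bytes unreferenced but not yet collected
    size_t collections;  // how many times collect() has run

    size_t available() const { return capacity - live - garbage; }

    /// The program drops its references to `bytes` of memory.
    void release(size_t bytes) { live -= bytes; garbage += bytes; }

    /// Reclaim everything that is unreferenced.
    void collect() { garbage = 0; ++collections; }

    /// Allocate; on failure collect once and retry, then give up.
    void allocate(size_t bytes)
    {
        if (bytes > available())
        {
            collect();
            if (bytes > available())
                throw new ToyOutOfMemory;
        }
        live += bytes;
    }
}

unittest
{
    import std.exception : assertThrown;

    auto heap = ToyHeap(100);
    heap.allocate(80);
    heap.release(80);            // garbage, but not collected yet
    heap.allocate(50);           // fails once, collects, then succeeds
    assert(heap.collections == 1 && heap.live == 50);
    assertThrown!ToyOutOfMemory(heap.allocate(60));   // still too big after GC
}
```

With this policy the caller never has to schedule a collection just to avoid running out of memory, which is the "as automated as possible" behaviour the post asks for.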
Aug 17 2001
"Walter" <walter digitalmars.com> writes:
I intend for there to be a class in the runtime library called "GC" or some
such through which you can tune the behavior of the garbage collector.
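
In current D releases an interface along these lines lives in the core.memory module as GC. The sketch below assumes that module and covers the "I'm done with this 100Mb array" case from the start of the thread; the function name releaseNow is hypothetical. A forced collection reclaims the buffer only if no other reference to it remains, so this is a request rather than a guarantee.

```d
import core.memory : GC;

/// Drop a large buffer and ask the collector to run right away.
/// `releaseNow` is a hypothetical helper name.
void releaseNow(ref double[] buffer)
{
    buffer = null;    // drop this reference; others may still keep it alive
    GC.collect();     // request a full collection now
    GC.minimize();    // hand unused memory back to the OS where possible
}

unittest
{
    auto data = new double[](1_000_000);
    releaseNow(data);
    assert(data is null);
}
```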
Aug 18 2001