digitalmars.D - garbage collection in d

Daniel Oberhoff <daniel danieloberhoff.de> writes:
Hi,

For a while now I have been following this new language, and recently I 
even read one of the books (Learn to Tango with D), and I must say I 
mostly love it. What I love is that it seems to clean up a lot of the 
mess that C++ is in, somehow brought to the point by the fact that 0x 
has become 1x (or hex). C++ is my primary language simply because it 
is the only portable high-level language with some establishment that I 
can easily tie into Java, Python, .NET, etc.

My hesitation is mostly rooted

a) in the chaos that the language still seems to be in, in that there is 
no sharp specification and a lot of things are changing rapidly

and

b) in my reluctance to depend on a complex runtime such as the one D 
brings along, at least because of its garbage collector

a) is not really a disadvantage, and may well be an advantage, though I 
have seen at least one project struggling with the lack of documentation.

b) worries me a little. I am working towards real-time systems with 
tight time and sometimes also tight memory constraints, and a 
conservative stop-the-world collector seems a bit daunting in this 
context. Is it reasonable to work without the collector, or are there 
plans to upgrade to a concurrent one? Also, are there extensive 
performance tests of how badly the collector interrupts real-time 
processing?

Keep up the good work. Maybe I can contribute sometime; till then I 
will linger a little more. :)
Apr 06 2010
bearophile <bearophileHUGS lycos.com> writes:
Daniel Oberhoff:

Just a few comments on your post; other people will give you more answers.

 recently I even read one of the books (learn to tango with d)

That book is about the D1 language, which is feature-frozen. Most of the effort is now going into the D2 language, which adds and changes several things (like the const system) and currently uses only an improved version of Phobos.
 a) in the chaos that the language still seems to be in in that there is 
 no sharp specification and a lot of things changing rapidly

D1 is frozen, and D2 has recently stopped its quick development; now it's mostly a matter of refining a few edges, removing many bugs and implementing a few parts that are still missing but already designed (and improving Phobos, for example with data structures).
 b) worries me a little. I am working towards real time systems with 
 tight time and sometimes also tight memory constraints, and a 
 conservative stop-the-world collector seems a bit daunting in this 
 context.

I think the D GC has not been tested in stressful, serious situations yet. I am no expert on this, but I think soft real-time requirements are not the main problem of the D GC: it is triggered when you allocate memory, and there are many ways to disable it or keep it from running in inner loops. Its main problem is probably that it's not precise, so it can cause leaks.
 is it reasonable to work without the collector,

There is a way to disable it, and you can even remove it, but then you can't touch several features of the language (associative arrays; array/string copying, growing and concatenation; several functions in Phobos, because they use the GC; and so on). The D language is built around the GC. But in many situations you can keep the GC and use your own manual memory allocation schemes, for example using memory from the C heap. So you can use arenas, memory stacks, free lists, placement new of classes and structs, manual delete, or scope(exit) delete, and so on. I have successfully used several of those things in small D programs.
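
A minimal sketch of the arena idea mentioned above, written in C++ because the thread itself contains no code. The `Arena` class, the `Particle` struct and `demo_arena` are hypothetical names invented for this illustration; the only assumption is that memory comes from the C heap via `malloc`/`free`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical bump-pointer arena backed by the C heap (malloc/free).
// Objects are carved out of one block and released all at once,
// so no collector is involved in the allocation path.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<unsigned char*>(std::malloc(capacity))),
          capacity_(base_ ? capacity : 0),
          used_(0) {}

    ~Arena() { std::free(base_); }

    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Returns storage aligned to `align` (assumed not to exceed malloc's
    // own alignment), or nullptr when the arena is exhausted.
    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) / align * align;
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return base_ + start;
    }

    // Placement new into arena storage. Destructors are never run,
    // so this sketch is only suitable for trivially destructible types.
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        void* p = allocate(sizeof(T), alignof(T));
        return p ? new (p) T{std::forward<Args>(args)...} : nullptr;
    }

    void reset() { used_ = 0; }                 // releases everything at once
    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};

struct Particle { float x, y, vx, vy; };

// Simulates three "frames", reusing the same memory in each one.
// Returns the number of bytes in use after the last frame.
std::size_t demo_arena() {
    Arena arena(4 * sizeof(Particle));          // room for exactly four particles
    for (int frame = 0; frame < 3; ++frame) {
        arena.reset();                          // drop last frame's objects
        for (int i = 0; i < 4; ++i) {
            arena.create<Particle>(0.0f, 0.0f, 1.0f, 1.0f);
        }
    }
    Particle* extra = arena.create<Particle>(0.0f, 0.0f, 1.0f, 1.0f);
    assert(extra == nullptr);                   // a fifth particle does not fit
    (void)extra;
    return arena.used();
}
```

Calling `reset()` once per frame releases everything in one step, so nothing has to be collected inside the inner loop. Objects with non-trivial destructors would need extra bookkeeping that this sketch leaves out.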
 or are there plans to upgrade to a concurrent one.

Surely sooner or later the D GC will be updated, because it's not advanced, but there are more urgent things to do now, so it may take some years.
 also are there extensive performance tests as how badly the collector
 interrupts real-time processing?

Not even one, as far as I know. Some people have written a few small games with D1.
 keep up the good work, maybe I can contribute sometime, till then I 
 will linger a little more. :)

You can start writing some benchmarks to test the things you care more about :-)

Bye,
bearophile
Apr 06 2010
Daniel Oberhoff <daniel danieloberhoff.de> writes:
On 2010-04-06 22:47:40 +0200, bearophile <bearophileHUGS lycos.com> said:

 I think the D GC has not been tested in stressful, serious situations
 yet. I am no expert on this, but I think soft real-time requirements
 are not the main problem of the D GC: it is triggered when you allocate
 memory, and there are many ways to disable it or keep it from running
 in inner loops. Its main problem is probably that it's not precise, so
 it can cause leaks.

Oh :). So what are the biggest projects done with D so far (D1 and D2)?
 
 keep up the good work, maybe I can contribute sometime, till then I
 will linger a little more. :)

 You can start writing some benchmarks to test the things you care
 more about :-)

I will see if I get round to it. I may have a go at trying to get a DSEL for array expressions going. I have been doing some of that in C++ with Boost.Proto, but it's hard, and I suppose in D it may be easier, because a lot of the metaprogramming seems to be more straightforward. I wonder if one could convince the Proto coders to port it to D :).
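
For context, a DSEL for array expressions is commonly built from expression templates. The following is a minimal C++ sketch of that technique; it is neither Boost.Proto nor code from this thread, and `Expr`, `Sum`, `Vec` and `demo_expr` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: lets operator+ accept only types that opt in as expressions.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Node of the expression tree: records the operands, computes nothing yet.
// It holds references, so it must not outlive the statement that built it.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R>> {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec : Expr<Vec> {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assigning an expression evaluates the whole tree in a single loop,
    // element by element, without building intermediate vectors.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
        return *this;
    }
};

// Builds a Sum node instead of computing a temporary Vec.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

// Adds three vectors and returns the sum of the result's elements.
double demo_expr() {
    Vec a(3, 1.0), b(3, 2.0), c(3, 3.0), r(3);
    r = a + b + c;   // type of the right-hand side: Sum<Sum<Vec, Vec>, Vec>
    return r[0] + r[1] + r[2];
}
```

The type of `a + b + c` encodes the whole expression, which is what makes the single fused loop possible; it is also where the template machinery gets heavy in C++.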
 
Apr 07 2010
bearophile <bearophileHUGS lycos.com> writes:
Daniel Oberhoff:

 oh :). so what are the biggest projects done with d so far ( one and two ) ?

I leave this question to other people.
 I may have a go at trying to get a 
 dsel for array expressions going. I have been doing some of that in c++ 
 with boost proto, but its hard, and I suppose in d it may be easier, 
 because a lot of the metaprogramming seems to be more straight forward.

Recently another person tried to implement expression templates in D2 for numerical code, but I think he ran into a small problem that will probably be fixed. Some time ago the main D developer thought about adding some kind of AST macros to D2, which could probably simplify the implementation of expression optimizations a lot. I don't know if they will ever be added; they introduce a good amount of complexity to the language. And it's not for D2 anyway, because there is enough eggplant roasting on the D2 fire now, and it's not cooked yet on both sides, so in the meantime you can try to use expression templates.

Bye,
bearophile
Apr 07 2010
Eric Poggel <dnewsgroup yage3d.net> writes:
On 4/7/2010 5:04 PM, Daniel Oberhoff wrote:
 oh :). so what are the biggest projects done with d so far ( one and two
 ) ?

D1 and is about 15kloc.
Apr 08 2010
Sean Kelly <sean invisibleduck.org> writes:
Daniel Oberhoff Wrote:
 
 b) my reluctance to the dependency on a complex runtime as the one d i 
 is bringing at least due to its garbage collector
 
 b) worries me a little. I am working towards real time systems with 
 tight time and sometimes also tight memory constraints, and a 
 conservative stop-the-world collector seems a bit daunting in this 
 context. is it reasonable to work without the collector, or are there 
 plans to upgrade to a concurrent one. also are there extensive 
 performance tests as how badly the collector interrupts real-time 
 processing?

It's still possible to build druntime with a custom GC. You can even have a "GC" that simply calls malloc/free if you avoid code that relies on implicit collection of discarded memory. See gc_stub for an example. As for better GC implementations, there are a bunch of options, but I don't know that we can go so far as an incremental collector à la Java. That D can call C code causes problems there.
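
A rough illustration of what a "GC" that simply calls malloc/free might look like, written in C++. This is not druntime's gc_stub and does not reproduce its interface; the `stub_gc` namespace and every function name in it are assumptions made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical collector interface whose implementation never collects.
// These names are invented for the sketch and are NOT druntime's API.
namespace stub_gc {

std::size_t live_blocks = 0;   // bookkeeping for the demo only

// "GC allocation" is a plain malloc.
void* gc_alloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p) ++live_blocks;
    return p;
}

// Memory comes back only through an explicit release.
void gc_release(void* p) {
    if (!p) return;
    std::free(p);
    --live_blocks;
}

// A real collector would scan roots and reclaim garbage here.
// The stub does nothing, so blocks that were never released simply leak.
void gc_collect() {}

}  // namespace stub_gc

// Returns how many blocks are still live after a "collection".
std::size_t demo_stub() {
    void* released = stub_gc::gc_alloc(64);
    void* forgotten = stub_gc::gc_alloc(64);
    stub_gc::gc_release(released);
    stub_gc::gc_collect();                   // no effect on `forgotten`
    std::size_t still_live = stub_gc::live_blocks;
    stub_gc::gc_release(forgotten);          // clean up so the demo itself does not leak
    return still_live;
}
```

With such a stub, code that relies on discarded memory being reclaimed implicitly keeps growing the heap, which is why it has to be avoided.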
Apr 07 2010
Jacob Carlborg <doob me.com> writes:
On 4/7/10 20:40, Sean Kelly wrote:
 It's still possible to build druntime with a custom GC.  You can even
 have a "GC" that simply calls malloc/free if you avoid coding that
 relies on implicit collection of discarded memory.  See gc_stub for an
 example.  As for better GC implementations, there are a bunch of
 options, but I don't know that we can go so far as an incremental
 collector ala Java.  That D can call C code causes problems there.

Maybe something like the AutoZone collector on Mac OS X:
http://www.opensource.apple.com/source/autozone/autozone-77.1/README.html?f=text

"AutoZone is a scanning, conservative, generational, multi-threaded garbage collector."

"... the implementation is language agnostic".

"The AutoZone collector is implemented in C++ and is designed to work in a runtime where some or most of the application's memory may be managed by mechanisms other than the collector."
Apr 07 2010
Sean Kelly <sean invisibleduck.org> writes:
Robert Jacques Wrote:
 
 Via reddit:
 How does it compare to boehm?
 Boehm is a drop in replacement for malloc. Autozone requires changes to  
 the compiler to emit write barriers, and enforces certain coding  
 constructs - so it doesn't "just work" and requires some changes to  
 adopt it. But autozone can be more efficient at collecting as a result.



Yeah, it's the write barriers that are the problem. I can see such a collector working in SafeD, but likely not in D proper.
Apr 07 2010
Walter Bright <newshound1 digitalmars.com> writes:
Sean Kelly wrote:
 Yeah, it's the write barriers that are the problem.  I can see such a
 collector working in SafeD, but likely not in D proper.

Write barriers trade off improved collection times for worse computation times.
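
One common form of write barrier is card marking, sketched below in C++ on a toy heap. Card marking is an assumption chosen for illustration; the thread does not say which kind of barrier AutoZone or a future D collector would use, and `ToyHeap`, `kCardSize` and `demo_barrier` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy heap of reference slots, grouped into "cards" of 8 slots each.
constexpr std::size_t kSlots = 64;
constexpr std::size_t kCardSize = 8;

struct ToyHeap {
    // Each slot holds the index of the slot it refers to.
    std::array<int, kSlots> slots{};
    // One dirty flag per card.
    std::array<bool, kSlots / kCardSize> dirty{};

    // Write barrier: every reference store also marks its card dirty.
    // This is the extra work the mutator pays on each write.
    void store(std::size_t slot, int target) {
        slots[slot] = target;
        dirty[slot / kCardSize] = true;
    }

    // Collector side: only slots on dirty cards need to be rescanned,
    // instead of the whole heap.
    std::size_t slots_to_rescan() const {
        std::size_t n = 0;
        for (bool d : dirty) {
            if (d) n += kCardSize;
        }
        return n;
    }

    // Called by the collector once the dirty cards have been processed.
    void clear_cards() { dirty.fill(false); }
};

// Three stores touch two cards, so 16 of 64 slots need rescanning.
std::size_t demo_barrier() {
    ToyHeap heap;
    heap.store(3, 40);
    heap.store(5, 41);    // same card as slot 3
    heap.store(60, 1);    // a different card
    return heap.slots_to_rescan();
}
```

The trade-off is visible in the two halves: `store` does more than a plain assignment, while the collector gets to skip every clean card.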
Apr 07 2010
Michel Fortin <michel.fortin michelf.com> writes:
On 2010-04-08 02:35:12 -0400, Walter Bright <newshound1 digitalmars.com> said:

 Sean Kelly wrote:
 Yeah, it's the write barriers that are the problem.  I can see such a
 collector working in SafeD, but likely not in D proper.

 Write barriers trade off improved collection times for worse
 computation times.

Essentially, Apple traded the previous reference-counted system in Objective-C, which requires memory barriers to update reference counts, for a garbage-collected system which also requires memory barriers but can deallocate objects in a separate thread. It's almost no tradeoff really.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Apr 08 2010
Sean Kelly <sean invisibleduck.org> writes:
Michel Fortin Wrote:

 Essentially, Apple traded the previous reference-counted system in
 Objective-C, which requires memory barriers to update reference counts,
 for a garbage-collected system which also requires memory barriers but
 can deallocate objects in a separate thread. It's almost no tradeoff
 really.

There's no problem with cycles using a scanning GC though.
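
The cycle problem can be shown with a small C++ sketch that uses `std::shared_ptr` as a stand-in for reference counting (it is not Objective-C's retain/release, and `Node` and `leaked_by_cycle` are hypothetical names). Two nodes that point at each other keep each other's count above zero even when nothing else refers to them, whereas a scanning collector would find both unreachable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Node {
    std::shared_ptr<Node> next;   // strong (counted) reference
    static int alive;             // number of Node objects currently alive
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds the cycle a -> b -> a, drops all outside references, and
// returns how many nodes are still alive at that point.
int leaked_by_cycle() {
    Node* raw = nullptr;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;              // each node now keeps the other's count at 1
        raw = a.get();
    }
    // No shared_ptr outside the cycle exists any more, yet no count hit zero.
    int leaked = Node::alive;

    // Manual fix-up so the demo does not really leak: cut one edge.
    // When `breaker` is destroyed at function exit, both nodes are freed.
    auto breaker = std::move(raw->next);
    return leaked;
}
```

The manual edge cut at the end is exactly the kind of bookkeeping that a scanning collector makes unnecessary.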
Apr 08 2010
Jacob Carlborg <doob me.com> writes:
On 4/7/10 22:58, Robert Jacques wrote:
 So autozone is language agnostic but not compiler agnostic, so calling
 C functions or using assembly is still a problem.

I suggested AutoZone because I assume we could make the necessary modifications to the D compiler to support a similar GC. And since it's designed to work in a runtime where most of the other applications don't use a GC, it would be a good choice for D. But I don't know much about these things.
Apr 08 2010
Daniel Oberhoff <daniel danieloberhoff.de> writes:
On 2010-04-07 20:40:27 +0200, Sean Kelly <sean invisibleduck.org> said:
 
 It's still possible to build druntime with a custom GC.  You can even 
 have a "GC" that simply calls malloc/free if you avoid coding that 
 relies on implicit collection of discarded memory.  See gc_stub for an 
 example.  As for better GC implementations, there are a bunch of 
 options, but I don't know that we can go so far as an incremental 
 collector ala Java.  That D can call C code causes problems there.

Ah, I had been thinking about that. Actually, you can call C code from Java and .NET. The point is that you have to take care of the memory you pass into the C function, and of the memory you get back out. So there must be a sane way, right?

Best
Daniel
Apr 07 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Wed, 07 Apr 2010 18:01:47 -0300, Daniel Oberhoff  
<daniel danieloberhoff.de> wrote:
 Ah, I had been thinking about that. Actually, you can call C code from
 Java and .NET. The point is that you have to take care of the memory
 you pass into the C function, and of the memory you get back out. So
 there must be a sane way, right?
 Best
 Daniel

Nope, not really. .NET C integration is done via message passing (aka marshaling) or via COM objects. I would hazard that Java does it similarly. This is slow, inefficient and somewhat limiting. Furthermore, given that all too often you have to manually write custom marshallers, it's also a big pain in the butt. Beyond C, D also supports assembler as a core language feature.
Apr 07 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Wed, 07 Apr 2010 17:34:09 -0300, Jacob Carlborg <doob me.com> wrote:

 Maybe something like the AutoZone collector on Mac OS X:
 http://www.opensource.apple.com/source/autozone/autozone-77.1/README.html?f=text
 "AutoZone is a scanning, conservative, generational, multi-threaded
 garbage collector." "... the implementation is language agnostic".
 "The AutoZone collector is implemented in C++ and is designed to work
 in a runtime where some or most of the application's memory may be
 managed by mechanisms other than the collector."

Via reddit:
 How does it compare to boehm?
 Boehm is a drop in replacement for malloc. Autozone requires changes to  
 the compiler to emit write barriers, and enforces certain coding  
 constructs - so it doesn't "just work" and requires some changes to  
 adopt it. But autozone can be more efficient at collecting as a result.
 Also autozone collects and runs finalizers in a background thread -  
 AFAIK Boehm does not. It's better for interactive desktop apps. Boehm is
 better for long-running server processes. The trade off is more  
 fragmentation with AutoZone for faster speeds for relatively  
 short-living desktop apps.


So autozone is language agnostic but not compiler agnostic, so calling C functions or using assembly is still a problem.
Apr 07 2010
"Robert Jacques" <sandford jhu.edu> writes:
On Wed, 07 Apr 2010 15:40:27 -0300, Sean Kelly <sean invisibleduck.org>  
wrote:
 It's still possible to build druntime with a custom GC.  You can even
 have a "GC" that simply calls malloc/free if you avoid coding that
 relies on implicit collection of discarded memory.  See gc_stub for an
 example.  As for better GC implementations, there are a bunch of
 options, but I don't know that we can go so far as an incremental
 collector ala Java.  That D can call C code causes problems there.

Incremental / generational GCs also have problems with impure/non-OO languages like D. Finding the right GC flags is reasonable when there are only references to GC aware objects. Once you have to start putting the flags somewhere else, global locks tend to get involved.
Apr 07 2010