digitalmars.D - How much data can the GC handle ?

llothar <llothar_member pathlink.com> writes:
Hello,

I've never seen an answer to this question (and I didn't search very hard):

How large can the data set become before the GC consumes too much space, or is it
O(1)? I doubt it. Has anybody ever written a program that handles tens of
millions of objects totalling hundreds of megabytes?

I thought about writing a test program, but I gave up because it is very
difficult to emulate a typical usage pattern. Sure, every program is different, but
I think a real-world program might be better than any synthetic benchmark.
Jul 24 2005
"Ben Hinkle" <ben.hinkle gmail.com> writes:
"llothar" <llothar_member pathlink.com> wrote in message 
news:dc0k8e$3df$1 digitaldaemon.com...
 Hello,

 I've never seen an answer to this question (and I didn't search very
 hard):

 How large can the data set become before the GC consumes too much space, or
 is it O(1)? I doubt it. Has anybody ever written a program that handles
 tens of millions of objects totalling hundreds of megabytes?

 I thought about writing a test program, but I gave up because it is very
 difficult to emulate a typical usage pattern. Sure, every program is
 different, but I think a real-world program might be better than any
 synthetic benchmark.

Maybe I don't get the question, but the amount of space available is proportional to the amount of memory (and virtual memory) available to any application. Are you asking how big the GC overhead is per allocation? Note that the overhead depends on the size requested. For small allocations the GC is very efficient, since it splits large blocks into small chunks; the overhead could be something like 1 bit per 8 bytes. For larger allocations the constant-size header is not noticeable.
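
To put rough numbers on those two cases, here is a minimal C++ sketch. It assumes the 1-bit-per-8-bytes ratio guessed above and a header size supplied by the caller; neither figure is taken from the actual D collector, and the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes of bookkeeping for a bitmap that spends 1 bit per 8 bytes of payload
// (an assumed ratio; the real collector's layout may differ).
inline double bitmap_overhead_bytes(std::size_t payload_bytes) {
    const double bits = static_cast<double>(payload_bytes) / 8.0;  // one bit per 8-byte unit
    return bits / 8.0;                                             // 8 bits per byte
}

// Fraction of a large allocation that is taken up by a fixed-size header.
// header_bytes is a hypothetical value; the thread does not state one.
inline double header_overhead_fraction(std::size_t payload_bytes, std::size_t header_bytes) {
    assert(payload_bytes > 0);
    return static_cast<double>(header_bytes) /
           static_cast<double>(payload_bytes + header_bytes);
}
```

Under these assumptions, bitmap_overhead_bytes(512 * 1024 * 1024) comes to 8 MB, that is 1/64 (about 1.6%) of the payload regardless of heap size, and a 16-byte header on a 1 MB block is far below 0.01%.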
Jul 24 2005
Thomas Kühne writes:

Ben Hinkle wrote:
 "llothar" <llothar_member pathlink.com> wrote in message 
 news:dc0k8e$3df$1 digitaldaemon.com...
 
 Hello,
 
 I've never seen an answer to this question (and I didn't search
 very hard):
 
 How large can the data set become before the GC consumes too much
 space, or is it O(1)? I doubt it. Has anybody ever written a
 program that handles tens of millions of objects totalling hundreds of
 megabytes?


The much more interesting question is: how efficient is the GC at detecting unused memory and reclaiming it? A very basic sample:

    void test(){
        size_t a;
        size_t* b;

        b = &a;
        a = cast(size_t) b;
    }

Will the GC reclaim a and b after exiting test() and calling std.gc.minimize()?
 I thought about writing a test program, but I gave up because it is
 very difficult to emulate a typical usage pattern.


Maybe you should start with synthetic tests: What happens if a program uses a large number of ints? Does a different nesting have any influence on the GC? ...
That way you could not only detect that there are potential problems but also identify and locate them.

Thomas
Jul 24 2005
llothar <llothar_member pathlink.com> writes:
In article <dc226e$169l$1 digitaldaemon.com>, Thomas Kühne says...

The much more interesting question is:
How efficient is the GC at detecting unused memory and reclaiming it?

That's why I asked in the past whether D emits type hints for the allocated structures. At the moment it does not, which is very bad. I found a message on the GC mailing list saying that this is a problem when, for example, allocating large floating-point arrays. The number of false positives was so high that the application became unusable, but using "GC_malloc_atomic" for the array removed the problem. I hope that D implements type hinting and the correct use of "GC_malloc" soon, because IMHO this is a serious, mission-critical problem.
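
The false-positive problem can be illustrated with a toy model, sketched here in C++. This is not the Boehm collector and not D's GC: Block, find_block, mark and garbage_survives are hypothetical names, the addresses are made up, and the `atomic` flag only imitates the idea of telling the collector that a block holds no pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy heap block with a made-up address range and a word-sized payload.
struct Block {
    std::uintptr_t addr;
    std::size_t size;                   // bytes
    bool atomic;                        // true: payload is declared pointer-free
    std::vector<std::uintptr_t> words;  // payload as the collector sees it
    bool marked;
};

// Index of the block whose range contains `word`, or -1 if none does.
inline int find_block(const std::vector<Block>& heap, std::uintptr_t word) {
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (word >= heap[i].addr && word < heap[i].addr + heap[i].size) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Conservative mark phase: every word that could be an address is followed.
inline void mark(std::vector<Block>& heap, const std::vector<std::uintptr_t>& roots) {
    std::vector<std::uintptr_t> work(roots);
    while (!work.empty()) {
        const std::uintptr_t word = work.back();
        work.pop_back();
        const int found = find_block(heap, word);
        if (found < 0) continue;                 // not an address into the heap
        Block& block = heap[static_cast<std::size_t>(found)];
        if (block.marked) continue;
        block.marked = true;
        if (block.atomic) continue;              // never scan pointer-free payloads
        work.insert(work.end(), block.words.begin(), block.words.end());
    }
}

// Builds a two-block heap and reports whether the unreferenced block is retained.
inline bool garbage_survives(bool array_is_atomic) {
    std::vector<Block> heap;
    // Block 0: a live numeric array; one element's bit pattern happens to be 0x2008.
    heap.push_back(Block{0x1000, 64, array_is_atomic, {0x2008, 7, 99}, false});
    // Block 1: garbage; no genuine pointer refers to it.
    heap.push_back(Block{0x2000, 64, false, {}, false});
    assert(find_block(heap, 0x2008) == 1);       // the stray value lands in block 1
    const std::vector<std::uintptr_t> roots = {0x1000};
    mark(heap, roots);
    return heap[1].marked;
}
```

In this model garbage_survives(false) reports that the dead block is kept alive by the stray array element, while garbage_survives(true) reports that it is not, which is the effect described for the pointer-free allocation call.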
Jul 25 2005
Larry Evans <cppljevans cos-internet.com> writes:
On 07/25/2005 07:38 AM, llothar wrote:
 In article <dc226e$169l$1 digitaldaemon.com>, Thomas Kühne says...
 
 
The much more interesting question is:
How efficient is the GC at detecting unused memory and reclaiming it?

That's why I asked in the past whether D emits type hints for the allocated structures. At the moment it does not, which is very bad. I found a message on the GC mailing list saying that this is a problem when, for example, allocating large floating-point arrays. The number of false positives was so high that the application became unusable, but using "GC_malloc_atomic" for the array removed the problem. I hope that D implements type hinting and the correct use of "GC_malloc" soon, because IMHO this is a serious, mission-critical problem.

http://cvs.sourceforge.net/viewcvs.py/boost-sandbox/boost-sandbox/boost/policy_ptr/

would provide this hint in the form of:

    selected_fields_description_of
      < FieldsVisitor
      , record_type
      >::ptr()

where record_type is the type of the structure, and FieldsVisitor is some type with member functions of the form:

    template<class PolicyPtr>
    void visit_field(PolicyPtr& a_ptr);

where PolicyPtr is the type of the fields in record_type which are "visited" by FieldsVisitor. Such a FieldsVisitor could be a kind of garbage collector which, as part of visit_field, marks the referent pointed to by a_ptr and then traverses the referent, of type PolicyPtr::referent_type, using:

    selected_fields_description_of
      < FieldsVisitor
      , PolicyPtr::referent_type
      >::ptr()

thus allowing a precise (as opposed to "conservative") scan of the heap.
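
As a rough illustration of that idea, and not of the actual policy_ptr interface, the following C++ sketch uses a hand-written function in place of a generated field description. Node, visit_pointer_fields and Marker are hypothetical names, and plain raw pointers stand in for the PolicyPtr types.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record type: one pointer field plus raw data that must not be scanned.
struct Node {
    Node* next;
    std::uintptr_t payload;  // may hold any bit pattern, including address-like values
    bool marked;
};

// Stand-in for a per-type field description: it hands every pointer field
// of the record to the visitor, and nothing else.
template <class Visitor>
void visit_pointer_fields(Node& record, Visitor& visitor) {
    visitor.visit_field(record.next);
}

// A marking visitor: marks the referent, then traverses the referent's own
// pointer fields. Recursion keeps the sketch short; a real collector would
// use an explicit work list.
struct Marker {
    std::size_t marked_count;

    Marker() : marked_count(0) {}

    void visit_field(Node*& field) {
        if (field == nullptr || field->marked) return;
        field->marked = true;
        ++marked_count;
        visit_pointer_fields(*field, *this);
    }
};
```

Because only `next` is ever visited, an address stored in `payload` cannot keep another node alive, which is the difference between a precise and a conservative scan.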
Oct 15 2005
Marcio <mqmnews123 sglebs.com> writes:
 How large can the data set become before the GC consumes too much space, or is it
 O(1)? I doubt it. Has anybody ever written a program that handles tens of
 millions of objects totalling hundreds of megabytes?

Somewhat related to your question, but not D-specific: http://www.cs.umass.edu/~emery/pubs/04-17.pdf

marcio
Jul 25 2005