digitalmars.D.bugs - Problem on GC and Big arrays on heap

k.inaba <k.inaba_member pathlink.com> writes:
With the current dmd and Phobos implementation, big arrays allocated on the GC
heap often fail to be deallocated. For example, when running the following code:

import std.gc;
void main() {
    for (uint i = 1; i <= 100; ++i) { char[] mem = new char[1024*1024*20]; }
    std.gc.fullCollect();
    ..
}

most of the mem[] arrays are not deallocated and stay in memory.

According to the -debug=LOGGING output from the Phobos gc, the mem[] arrays are
reported as referenced from the static data segment. More specifically (in my
environment), the bit patterns 0x02020202, 0x03030303, etc. in UTF8stride[] in
std/utf.d, and other bit patterns in _ctype[] in std/ctype.d, are wrongly
recognized as pointers into the mem[] arrays.

I know this is the famous limitation of conservative GC, but isn't there any
way to remedy this case - for example, by separating the segment for non-pointer
global data from the segment for global pointers?

Currently, the bit patterns contained in UTF8stride[] almost surely capture
most arrays larger than 16MB, and this may cause a severe problem in practice.
(Of course, any other flag table may cause the same problem, but I can avoid
that by allocating such tables on the non-GC heap. I cannot escape from the
global tables in the standard library, though!)
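
The ">16MB" observation above can be checked with a little arithmetic. Words
made of one repeated byte (0x01010101, 0x02020202, 0x03030303, ...) are spaced
0x01010101 bytes apart, roughly 16.06 MiB, so a block at least that large
cannot fit between two neighbouring patterns. The sketch below is only an
illustration: the function name is hypothetical, it assumes a 32-bit address
space, and it assumes every repeated byte from 0x01 to 0xFF occurs somewhere in
scanned static data, which is more than the tables named above are known to
contain.

```d
/// Returns true if some 32-bit word whose four bytes are all equal
/// (0x01010101 .. 0xFFFFFFFF) falls inside [base, base + size), i.e. a
/// conservative scan would take that word for a pointer into the block.
/// Hypothetical helper for illustration; not part of Phobos.
bool hitByRepeatedByte(ulong base, ulong size)
{
    enum ulong step = 0x01010101;        // distance between two patterns
    ulong k = (base + step - 1) / step;  // first multiple of step >= base
    if (k == 0)
        k = 1;                           // 0x00000000 is null, not a hit
    return k <= 0xFF && k * step < base + size;
}
```

For instance, a 20 MiB block starting at 0x02000000 contains 0x02020202, while
a 1 MiB block starting at 0x02020203 contains no such word. Smaller blocks can
still be hit by chance; blocks of 0x01010101 bytes or more are hit wherever
they land in this model.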
Jan 15 2006
"Walter Bright" <newshound digitalmars.com> writes:
"k.inaba" <k.inaba_member pathlink.com> wrote in message 
news:dqfb2e$f7b$1 digitaldaemon.com...
 According to the -debug=LOGGING output from the Phobos gc, the mem[] arrays
 are reported as referenced from the static data segment. More specifically
 (in my environment), the bit patterns 0x02020202, 0x03030303, etc. in
 UTF8stride[] in std/utf.d, and other bit patterns in _ctype[] in
 std/ctype.d, are wrongly recognized as pointers into the mem[] arrays.

 I know this is the famous limitation of conservative GC, but isn't there any
 way to remedy this case - for example, by separating the segment for
 non-pointer global data from the segment for global pointers?
Perhaps.
 Currently, the bit patterns contained in UTF8stride[] almost surely capture
 most arrays larger than 16MB, and this may cause a severe problem in
 practice. (Of course, any other flag table may cause the same problem, but I
 can avoid that by allocating such tables on the non-GC heap. I cannot escape
 from the global tables in the standard library, though!)
Garbage collection tends to come unglued with allocations that are large
relative to the address space. The best way to deal with this is to handle such
large allocations explicitly, i.e. explicitly delete them when the program is
done with them. Alternatively, access the large chunk only through a small
class, and explicitly delete the large chunk in that class's destructor.
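
A minimal sketch of the small-class approach follows, written for a current D
compiler rather than the 2006 one. It is not the code from this thread: the
class name BigBuffer and its members are hypothetical, and instead of applying
`delete` to a GC array it swaps in C `malloc`/`free` for the large chunk, on
the assumption that the destructor should not touch GC-managed memory while a
collection is in progress. The large chunk then lives on the non-GC heap, and
only the small handle object is left to the collector.

```d
import core.stdc.stdlib : malloc, free;

/// Small GC-managed handle that owns one large buffer on the C heap.
/// Hypothetical class for illustration; not part of Phobos.
final class BigBuffer
{
    private char* ptr;
    private size_t len;

    this(size_t n)
    {
        ptr = cast(char*) malloc(n);
        if (ptr is null && n != 0)
            throw new Exception("malloc failed");
        len = n;
    }

    /// View of the buffer; valid only until release() or destruction.
    char[] data() { return ptr[0 .. len]; }

    /// Explicit release. Safe to call more than once.
    void release()
    {
        free(ptr);      // free(null) is a no-op
        ptr = null;
        len = 0;
    }

    /// Fallback if the owner never calls release() explicitly.
    ~this() { release(); }
}
```

Typical use would be `auto b = new BigBuffer(1024 * 1024 * 20);` followed by
`scope(exit) b.release();`, so that the large chunk is returned deterministically
and the destructor only serves as a safety net.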
Jan 23 2006