digitalmars.D - Changes to the Tango runtime / GC
- Sean Kelly <sean f4.ca> Oct 13 2007
- Robert Fraser <fraserofthenight gmail.com> Oct 13 2007
- 0ffh <spam frankhirsch.net> Oct 13 2007
- Sean Kelly <sean f4.ca> Oct 13 2007
- David Brown <dlang davidb.org> Oct 13 2007
- Bruno Medeiros <brunodomedeiros+spam com.gmail> Oct 14 2007
- 0ffh <spam frankhirsch.net> Oct 14 2007
- Bruno Medeiros <brunodomedeiros+spam com.gmail> Oct 15 2007
- Sean Kelly <sean f4.ca> Oct 15 2007
All the recent talk about the GC inspired me to make some changes that
I'd been planning for Tango. I wanted to post about them here to make
sure everyone was aware of them because they are silent changes that
could affect the reliability of a program in certain rare circumstances.
Previously, the Tango runtime worked just like the Phobos runtime in
that the hasPointers flag was set or cleared on array operations based
on the type of the variable referencing the underlying memory. For example:
byte[] b = cast(byte[]) new void[5]; // hasPointers is set
byte[] c = b[1 .. $];
b.length = 10; // hasPointers is retained because no realloc occurs
b.length = 20; // hasPointers is lost/cleared because of a realloc
c.length = 1; // hasPointers is lost/cleared--slices always realloc
c.length = 20; // hasPointers is lost/cleared for same reason as above
The opposite was also true:
void[] v = new byte[5]; // hasPointers is not set
void[] w = v[1 .. $];
v.length = 10; // hasPointers is still 0 because no realloc occurs
v.length = 20; // hasPointers is set because of a realloc
w.length = 1; // hasPointers is set because slices always realloc
w.length = 20; // hasPointers is set for same reason as above
The new behavior of Tango is to preserve block attributes for any
allocated block whose size is simply changing as the result of an array
operation. *** This is true of both slices and normal arrays. *** For
example:
byte[] b = cast(byte[]) new void[5]; // hasPointers is set
byte[] c = b[1 .. $];
b.length = 10; // hasPointers is retained because no realloc occurs
b.length = 20; // hasPointers is retained because of new logic
c.length = 1; // hasPointers is retained because of new logic
c.length = 20; // hasPointers is retained because of new logic
The same behavior is true of void references to byte arrays. Here is a
quick run-down of more complex cases:
byte[] b = cast(byte[]) new void[5];
b ~= 0; // hasPointers is preserved on realloc
b ~= [0,1,2]; // hasPointers is preserved on realloc
b = [0,1] ~ [2]; // hasPointers is not retained - A
b = null;
b.length = 10; // hasPointers is lost because reference was cleared
In situation A above, hasPointers is not retained because the operation
is an assignment rather than a resize.
I believe these changes will result in more predictable behavior than
before. However, they can cause a change in program behavior in rare cases:
void[] getBlock()
{
    return new byte[32];
}
With the old behavior, manipulating the block returned by getBlock in a
way that caused a reallocation to occur would result in hasPointers
being set on that block. With the new behavior, hasPointers would not
be set because the runtime would be preserving the bits set on the
original block, which was allocated as a byte[]. Thus it's possible in
certain degenerate cases that the bits set on a memory block could
silently propagate and "poison" an application in instances where they
were discarded before.
Sean
Oct 13 2007
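The getBlock case from the post above can be probed with a small program. This is only a sketch and makes several assumptions: it targets today's druntime module core.memory (not the 2007 Tango runtime under discussion), it treats "hasPointers" as the absence of the GC.BlkAttr.NO_SCAN attribute, and the helper name hasPointers is made up for illustration. What the second line prints depends on the runtime in use, so no output is claimed here.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical helper: true when the GC will scan the block behind `arr`
// for pointers, i.e. the NO_SCAN attribute is clear.
bool hasPointers(const(void)[] arr)
{
    // getAttr expects the base address of a GC block, so look it up first.
    auto base = GC.addrOf(cast(void*) arr.ptr);
    return base !is null && (GC.getAttr(base) & GC.BlkAttr.NO_SCAN) == 0;
}

// Same function as in the post: a byte[] handed out as void[].
void[] getBlock()
{
    return new byte[32];
}

void main()
{
    void[] blk = getBlock();
    writefln("fresh block:  hasPointers = %s", hasPointers(blk));

    // Grow far enough that the runtime will probably have to reallocate.
    blk.length = 4096;
    writefln("after resize: hasPointers = %s", hasPointers(blk));
}
```

If the attribute is "sticky" as described, both lines should report the same value; a runtime that derives the flag from the static type void[] would be expected to flip it after the resize.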
So, wait, under this new regime, there will be *more* chances of random data causing memory leaks?
Oct 13 2007
Robert Fraser wrote: So, wait, under this new regime, there will be *more* chances of random data causing memory leaks?
My /guess/ is: Neither reliably more nor less, just different... :)
Regards, Frank
Oct 13 2007
Robert Fraser wrote: So, wait, under this new regime, there will be *more* chances of random data causing memory leaks?
No, fewer, because the behavior of the hasPointers attribute will be more predictable.
Sean
Oct 13 2007
On Sat, Oct 13, 2007 at 08:38:54AM -0700, Sean Kelly wrote: The new behavior of Tango is to preserve block attributes for any allocated block whose size is simply changing as the result of an array operation. *** This is true of both slices and normal arrays. ***
As I understand it, this will mean that GrowBuffers won't suddenly start having pointers after the first resize operation. I think this behavior is a good thing. The attribute should be an aspect of the data, not of the particular things that are pointing to it.
David
Oct 13 2007
Sean Kelly wrote:
byte[] b = cast(byte[]) new void[5]; // hasPointers is set
byte[] c = b[1 .. $];
b.length = 10; // hasPointers is retained because no realloc occurs
b.length = 20; // hasPointers is retained because of new logic
c.length = 1; // hasPointers is retained because of new logic
c.length = 20; // hasPointers is retained because of new logic
So, is there a way to allocate an uninitialized byte[] that has hasPointer set to false?
--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 14 2007
Bruno Medeiros wrote: So, is there a way to allocate an uninitialized byte[] that has hasPointer set to false?
I'd say byte[] defaults to hasPointers=false, so you wouldn't have to do anything special to make it so. (It is the void[] that usually has hasPtrs=true.)
But you can init a byte[] with hasPtrs=true with:
byte[] x = cast(byte[]) (new void[0x100]);
Or you can init a void[] with hasPtrs=false with:
void[] x = cast(void[]) (new byte[0x100]);
The difference in the new Tango RT is that it makes hasPtrs "sticky". All these details were in:
news://news.digitalmars.com:119/feqoqc$30ji$1 digitalmars.com
Regards, Frank
Oct 14 2007
0ffh wrote: Bruno Medeiros wrote: So, is there a way to allocate an uninitialized byte[] that has hasPointer set to false?
I'd say byte[] defaults to hasPointers=false, so you wouldn't have to do anything special to make it so. (It is the void[] that usually has hasPtrs=true.)
Agh, never mind. I thought the purpose of allocating void[] was to allocate an uninitialized array, but that's what the void initializer does instead. So void[] is the same as byte[] but with hasPointers=true, then? I couldn't find anything about that in the docs.
--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Oct 15 2007
Bruno Medeiros wrote: 0ffh wrote: Bruno Medeiros wrote: So, is there a way to allocate an uninitialized byte[] that has hasPointer set to false?
I'd say byte[] defaults to hasPointers=false, so you wouldn't have to do anything special to make it so. (It is the void[] that usually has hasPtrs=true.)
Agh nevermind, I thought the purpose of allocating void[] was to allocate an uninitialized array, but that's what the void initializer does instead. void[] is the same as byte[] but with hasPointers=true then? I couldn't find about that in the doc.
I don't think it is in the docs, but it is in the 1.0x changelog somewhere. The only other difference is that, like void*, void[] can accept some other types of data without a cast. As far as I know, though, there is no way to allocate an uninitialized dynamic array using 'new'. I don't imagine this works?
byte[] x = new byte[8] = void;
This is one reason that Tango offers access to the GC.malloc and GC.calloc routines as a means to directly allocate either uninitialized or zero-initialized memory.
Sean
Oct 15 2007
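As a rough illustration of the GC.malloc / GC.calloc route mentioned in the last post, here is a minimal sketch. It assumes the present-day druntime module core.memory rather than the 2007 Tango API, and the helper names uninitializedBytes and zeroedBytes are hypothetical, chosen only for this example.

```d
import core.memory : GC;

// Hypothetical helper: `n` bytes the GC will not scan, contents NOT initialized.
byte[] uninitializedBytes(size_t n)
{
    auto p = cast(byte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

// Hypothetical helper: same, but zero-filled via GC.calloc.
byte[] zeroedBytes(size_t n)
{
    auto p = cast(byte*) GC.calloc(n, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    auto raw = uninitializedBytes(256); // holds whatever was in memory
    auto zeroed = zeroedBytes(256);

    assert(raw.length == 256);
    foreach (b; zeroed)
        assert(b == 0);
}
```

Passing GC.BlkAttr.NO_SCAN at allocation time sets the attribute on the block itself, which matches the view expressed in the thread that the flag should belong to the data rather than to the type of whatever references it.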








