digitalmars.D.learn - GC.calloc(), then what?

"Ali Çehreli" <acehreli yahoo.com> writes:

1) After allocating memory by GC.calloc() to place objects on it, what 
else should one do? In what situations does one need to call addRoot() 
or addRange()?

2) Does the answer to the previous question differ for struct objects 
versus class objects?

3) Is there a difference between core.stdc.stdlib.calloc() and 
GC.calloc() in that regard? Which one to use in what situation?

4) Are the random bit patterns in a malloc()'ed memory always a concern 
for false pointers? Does that become a concern after calling addRoot() 
or addRange()? If so, why would anyone ever malloc() instead of always 
calloc()'ing?

Ali

Jun 27 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
 1) After allocating memory by GC.calloc() to place objects on 
 it, what else should one do?

Use std.conv.emplace.
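
A minimal sketch of what that can look like for both a struct and a class. The types Point and Greeter and the helpers makePoint and makeGreeter are made up for illustration, and I'm assuming the struct holds no pointers (hence NO_SCAN) while the class does:

```d
import core.memory : GC;
import std.conv : emplace;

struct Point
{
    int x;
    int y;
}

class Greeter
{
    string name;

    this(string name)
    {
        this.name = name;
    }
}

// Construct a struct in a zeroed, GC-owned block.
Point* makePoint(int x, int y)
{
    // Point holds no pointers, so the GC need not scan the block.
    void* raw = GC.calloc(Point.sizeof, GC.BlkAttr.NO_SCAN);
    return emplace(cast(Point*) raw, x, y);
}

// Construct a class object in a zeroed, GC-owned block.
Greeter makeGreeter(string name)
{
    enum size = __traits(classInstanceSize, Greeter);
    // Scanned by default, which is what we want: the object holds a string.
    void[] raw = GC.calloc(size)[0 .. size];
    return emplace!Greeter(raw, name);
}

void main()
{
    auto p = makePoint(3, 4);
    auto g = makeGreeter("Ali");
    assert(p.x == 3 && p.y == 4);
    assert(g.name == "Ali");
}
```

Note that a class needs __traits(classInstanceSize, T) for the block size, since T.sizeof is only the size of the reference.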

 In what situations does one need to call addRoot() or 
 addRange()?

addRoot creates an internal reference within the GC to the
memory pointed to by its argument (void* p).
This pins the memory so that it won't be collected by the GC.
E.g. you're going to pass a string to an extern(C) function, and
the function will store a pointer to the string within its own
data structures. Since the GC has no access to those data
structures, you must addRoot the string to avoid creating a
dangling pointer in the C data structure.
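
Here is a rough sketch of that scenario. There is no real C library here: c_register_name, c_registered_name and cSlot are hypothetical stand-ins, written in D, that keep the pointer in C-heap storage which the GC does not scan:

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import core.stdc.string : strcmp;

// Stand-in for a C library's private storage (hypothetical).
__gshared const(char)** cSlot;

extern (C) void c_register_name(const(char)* name) nothrow @nogc
{
    if (cSlot is null)
        cSlot = cast(const(char)**) malloc((const(char)*).sizeof);
    assert(cSlot !is null, "malloc failed");
    *cSlot = name;
}

extern (C) const(char)* c_registered_name() nothrow @nogc
{
    return cSlot is null ? null : *cSlot;
}

void main()
{
    char[] buf = "hello\0".dup; // GC-allocated, zero-terminated copy
    GC.addRoot(buf.ptr);        // pin it before handing it to the C side
    c_register_name(buf.ptr);

    buf = null;                 // drop the D-side reference
    GC.collect();               // the block stays alive because it is a root
    assert(strcmp(c_registered_name(), "hello") == 0);

    // Once the C side no longer needs the string, unpin it.
    GC.removeRoot(c_registered_name());
    free(cSlot);
    cSlot = null;
}
```

Every addRoot should eventually be paired with a removeRoot, otherwise the block is never collected.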

addRange is usually for cases when you use
core.stdc.stdlib.malloc/calloc and place pointers to GC-managed
memory within that memory. This allows the GC to scan that memory
for pointers during collection; otherwise it may reclaim memory
which is pointed to by malloc'd memory.
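
Something along these lines, as a sketch only. Node, makeNode and freeNode are names I made up; the point is just the order of calls (allocate, addRange, store GC pointers, and later removeRange before free):

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

// Lives on the C heap but refers to GC-managed memory.
struct Node
{
    int[] payload;
}

Node* makeNode(size_t length, int fill)
{
    // calloc rather than malloc, so the registered range starts out
    // free of stray bit patterns that could look like pointers.
    auto node = cast(Node*) calloc(1, Node.sizeof);
    assert(node !is null, "calloc failed");

    // Register the block before storing GC pointers in it.
    GC.addRange(node, Node.sizeof);

    node.payload = new int[](length);
    node.payload[] = fill;
    return node;
}

void freeNode(Node* node)
{
    // Unregister first, so the GC never scans freed memory.
    GC.removeRange(node);
    free(node);
}

void main()
{
    auto node = makeNode(4, 42);
    GC.collect(); // payload is kept alive through the registered range
    assert(node.payload == [42, 42, 42, 42]);
    freeNode(node);
}
```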

 2) Does the answer to the previous question differ for struct 
 objects versus class objects?

No.

 3) Is there a difference between core.stdc.stdlib.calloc() and 
 GC.calloc() in that regard? Which one to use in what situation?

One is GC managed, the other is not. calloc simply means the 
memory is pre-zero'd, it has nothing to do with "C allocation" / 
"allocation in the C language"

 4) Are the random bit patterns in a malloc()'ed memory always a 
 concern for false pointers? Does that become a concern after 
 calling addRoot() or addRange()?

If by malloc you're talking about stdc.stdlib.malloc then:
It only becomes a concern after you call addRange, and the false 
pointers potential is only present within the range you gave to 
addRange.
So if you over-allocate using malloc and give the entire memory
range to addRange, then any false pointers in the uninitialized
portion become a concern.
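
One way to sidestep that, sketched under the assumption that only a prefix of the block will ever hold GC pointers (allocSlots and freeSlots are hypothetical helpers): initialize and register just the part that is in use.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

// Allocates room for `capacity` pointers but registers only the first `used`.
void** allocSlots(size_t capacity, size_t used)
{
    assert(used <= capacity);
    auto slots = cast(void**) malloc(capacity * (void*).sizeof);
    assert(slots !is null, "malloc failed");

    // Clear only the part the GC is going to scan.
    foreach (i; 0 .. used)
        slots[i] = null;

    // The uninitialized tail is never shown to the GC.
    GC.addRange(slots, used * (void*).sizeof);
    return slots;
}

void freeSlots(void** slots)
{
    GC.removeRange(slots);
    free(slots);
}

void main()
{
    auto slots = allocSlots(1024, 8);

    auto value = new int;
    *value = 5;
    slots[0] = value;
    value = null;

    GC.collect();
    assert(*cast(int*) slots[0] == 5);
    freeSlots(slots);
}
```

If the used portion grows later, the range has to be removed and added again with the new size.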

If you're talking about GC.malloc():
Currently the GC zeros the memory unless you allocate NO_SCAN 
memory, so it only differs in the NO_SCAN case.

 If so, why would anyone ever malloc() instead of always 
 calloc()'ing?

To save on redundant zero'ing.
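
As a small illustration of when each one fits (makeBuffer and makeSlots are made-up helper names): a pointer-free buffer that gets overwritten right away can come from GC.malloc with NO_SCAN, while a block that will hold pointers is better off starting zeroed.

```d
import core.memory : GC;

// Byte buffer with no pointers inside: tell the GC not to scan it.
// GC.malloc does not promise any initial contents, so fill before reading.
ubyte[] makeBuffer(size_t n, ubyte fill)
{
    auto p = cast(ubyte*) GC.malloc(n, GC.BlkAttr.NO_SCAN);
    p[0 .. n] = fill;
    return p[0 .. n];
}

// Scanned block meant to hold pointers: GC.calloc guarantees zeros,
// so every slot starts out as null.
void*[] makeSlots(size_t n)
{
    auto p = cast(void**) GC.calloc(n * (void*).sizeof);
    return p[0 .. n];
}

void main()
{
    auto buffer = makeBuffer(1024, 0xAB);
    assert(buffer[$ - 1] == 0xAB);

    auto slots = makeSlots(16);
    foreach (slot; slots)
        assert(slot is null);
}
```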

Jun 27 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

I realize that my answer isn't completely clear in some cases; if
you still have questions, ask away.

Jun 27 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

On 06/27/2014 12:53 AM, safety0ff wrote:
 I realize that my answer isn't completely clear in some cases, if you
 still have questions, ask away.

Done! That's why we are here anyway. :p

Ali

Jun 27 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

Thank you for your responses. I am partly enlightened. :p

On 06/27/2014 12:34 AM, safety0ff wrote:

 On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
 1) After allocating memory by GC.calloc() to place objects on it, what
 else should one do?

 Use std.conv.emplace.

That much I know. :) I have actually finished the first draft of
translating my memory management chapter (the last one in the book!) and
am trying to make sure that the information is correct.

 In what situations does one need to call addRoot() or addRange()?

 Add root creates an internal reference within the GC to the memory
 pointed by the argument (void* p.)
 This pins the memory so that it won't be collected by the GC. E.g.
 you're going to pass a string to an extern C function, and the function
 will store a pointer to the string within its own data structures. Since
 the GC won't have access to the data structures, you must addRoot it to
 avoid creating a dangling pointer in the C data structure.

Additionally, and according to the documentation, any other GC blocks
accessible via it will be considered live. So, addRoot makes a true
root from which the GC starts its scanning.

 Add range is usually for cases when you use stdc.stdlib.malloc/calloc
 and place pointers to GC managed memory within that memory. This allows
 the GC to scan that memory for pointers during collection, otherwise it
 may reclaim memory which is pointed to by malloc'd memory.

One part that I don't understand in the documentation is "if p points 
into a GC-managed memory block, addRange does not mark this block as live".



Does that mean that if I have objects in my addRange'd memory that in 
turn have references to objects in the GC-managed memory, my references 
in my memory may be stale?

If so, does that mean that if I manage objects in my memory, all their 
members should be managed by me as well?

This seems to bring two types of GC-managed memory:

1) addRoot'ed memory that gets scanned deep (references are followed)

2) addRange'd memory that gets scanned shallow (references are not followed)

See, that's confusing: What does that mean? I still hold the memory 
block anyway; what does the GC achieve by scanning my memory if it's not 
going to follow references anyway?

 2) Does the answer to the previous question differ for struct objects
 versus class objects?

 No.

 3) Is there a difference between core.stdc.stdlib.calloc() and
 GC.calloc() in that regard? Which one to use in what situation?

 One is GC managed, the other is not. calloc simply means the memory is
 pre-zero'd, it has nothing to do with "C allocation" / "allocation in
 the C language"

I know even that much. ;) I find people's malloc+memset code amusing.

 4) Are the random bit patterns in a malloc()'ed memory always a
 concern for false pointers? Does that become a concern after calling
 addRoot() or addRange()?

 If by malloc you're talking about stdc.stdlib.malloc then:
 It only becomes a concern after you call addRange,

But addRange doesn't seem to make sense for stdlib.malloc'ed memory, 
right? The reason is, that memory is not managed by the GC so there is 
no danger of losing that memory due to a collection anyway. It will go 
away only when I call stdlib.free.

 and the false
 pointers potential is only present within the range you gave to addRange.
 So if you over-allocate using malloc and give the entire memory range to
 addRange, then any false pointers in the uninitialized portion become a
 concern.

Repeating myself, that makes sense but I don't see when I would need
addRange on stdlib.malloc'ed memory.

 If you're talking about GC.malloc():
 Currently the GC zeros the memory unless you allocate NO_SCAN memory, so
 it only differs in the NO_SCAN case.

So, the GC's default behavior is to scan the memory, necessitating 
clearing the contents? That seems to make GC.malloc() behave the same as 
GC.calloc() by default, doesn't it?

So, is this guideline right?

   "GC.malloc() makes sense only with NO_SCAN."

 If so, why would anyone ever malloc() instead of always calloc()'ing?

 To save on redundant zero'ing.

And again, redundant zero'ing is saved only when used with NO_SCAN.

I think I finally understand the main difference between stdlib.malloc 
and GC.malloc: The latter gets collected by the GC.

Another question: Is GC.malloc'ed and GC.calloc'ed memory scanned deep?

Ali

Jun 27 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
 Thank you for your responses. I am partly enlightened. :p

I know you're a knowledgeable person in the D community, so I may
have stated many things you already knew, but I tried to answer
the questions as-is.


 On 06/27/2014 12:34 AM, safety0ff wrote:

 Add range is usually for cases when you use
 stdc.stdlib.malloc/calloc and place pointers to GC managed
 memory within that memory. This allows the GC to scan that
 memory for pointers during collection, otherwise it may reclaim
 memory which is pointed to by malloc'd memory.

 One part that I don't understand in the documentation is "if p 
 points into a GC-managed memory block, addRange does not mark 
 this block as live".

 [SNIP]

 See, that's confusing: What does that mean? I still hold the 
 memory block anyway; what does the GC achieve by scanning my 
 memory if it's not going to follow references anyway?

The GC _will_ follow references (i.e. scan deeply,) that's the 
whole point of addRange.
What that documentation is saying is that:

If you pass a range R to addRange, and R lies in the GC
heap, then once there are no pointers (roots) to R, the GC will
collect it anyway, even though you called addRange on it.

In other words, prefer using addRoot for GC memory and addRange 
for non-GC memory.
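
To make the "deep" part concrete, here is a sketch with hypothetical Leaf and Branch classes and made-up helpers makeSlot and freeSlot. Only the Branch reference sits in the registered malloc'd slot; the Leaf is reached by following references from there:

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

class Leaf
{
    int value;
    this(int value) { this.value = value; }
}

class Branch
{
    Leaf leaf;
    this(Leaf leaf) { this.leaf = leaf; }
}

// One pointer-sized slot on the C heap, registered for scanning.
Branch* makeSlot(int value)
{
    auto slot = cast(Branch*) calloc(1, (void*).sizeof);
    assert(slot !is null, "calloc failed");
    GC.addRange(slot, (void*).sizeof);

    // The slot refers to the Branch; the Branch refers to the Leaf.
    *slot = new Branch(new Leaf(value));
    return slot;
}

void freeSlot(Branch* slot)
{
    GC.removeRange(slot);
    free(slot);
}

void main()
{
    auto slot = makeSlot(7);
    GC.collect();
    // Both objects are still valid: the GC followed slot -> Branch -> Leaf.
    assert((*slot).leaf.value == 7);
    freeSlot(slot);
}
```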


 4) Are the random bit patterns in a malloc()'ed memory always a
 concern for false pointers? Does that become a concern after
 calling addRoot() or addRange()?

 If by malloc you're talking about stdc.stdlib.malloc then:
 It only becomes a concern after you call addRange,

 But addRange doesn't seem to make sense for stdlib.malloc'ed 
 memory, right? The reason is, that memory is not managed by the 
 GC so there is no danger of losing that memory due to a 
 collection anyway. It will go away only when I call stdlib.free.

addRange almost exclusively makes sense with stdlib.malloc'ed 
memory.
As you've stated: If you pass it GC memory it does not mark the 
block as live.

I believe the answer above clears things up: the GC does scan the 
range, and scanning is always "deep" (i.e. when it finds pointers 
to unmarked GC memory, it marks them.)

Conversely, addRoot exclusively makes sense with GC memory.

 If you're talking about GC.malloc():
 Currently the GC zeros the memory unless you allocate NO_SCAN
 memory, so it only differs in the NO_SCAN case.

 So, the GC's default behavior is to scan the memory, 
 necessitating clearing the contents? That seems to make 
 GC.malloc() behave the same as GC.calloc() by default, doesn't 
 it?


I don't believe it's necessary to clear it, it's just a measure 
against false pointers (AFAIK.)


 So, is this guideline right?

   "GC.malloc() makes sense only with NO_SCAN."


I wouldn't make a guideline like that, just say that: if you want 
the memory to be guaranteed to be zero'd use GC.calloc.

However, due to GC internals (for preventing false pointers,)
GC.malloc'd memory will often be zero'd anyway.

 If so, why would anyone ever malloc() instead of always
 calloc()'ing?

 To save on redundant zero'ing.

 And again, redundant zero'ing is saved only when used with 
 NO_SCAN.

Yup.

 I think I finally understand the main difference between 
 stdlib.malloc and GC.malloc: The latter gets collected by the 
 GC.

Yup.

 Another question: Is GC.malloc'ed and GC.calloc'ed memory
 scanned deep?

Yes, only NO_SCAN memory doesn't get scanned, everything else 
does.

Jun 27 2014

"Ali Çehreli" <acehreli yahoo.com> writes:

On 06/27/2014 01:49 AM, safety0ff wrote:

 On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
 Thank you for your responses. I am partly enlightened. :p

 I know you're a knowledgeable person in the D community, I may have
 stated many things you already knew, but I tried to answer the questions
 as-is.

I appreciated your answers, which were very helpful. What I meant was, I 
was "partially" enlightened but still had some questions. I am in much 
better shape now. :)

Ali

Jun 27 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 27 June 2014 at 23:26:55 UTC, Ali Çehreli wrote:
 I appreciated your answers, which were very helpful. What I 
 meant was, I was "partially" enlightened but still had some 
 questions. I am in much better shape now. :)

Yea, I understood what you meant. :)

Jun 28 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
 So, the GC's default behavior is to scan the memory, 
 necessitating clearing the contents? That seems to make 
 GC.malloc() behave the same as GC.calloc() by default, doesn't 
 it?

Yes.
compare:
https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L543
to:
https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L419

Jun 27 2014

"safety0ff" <safety0ff.dev gmail.com> writes:

On Friday, 27 June 2014 at 09:20:53 UTC, safety0ff wrote:
 Yes.
 compare:
 https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L543
 to:
 https://github.com/D-Programming-Language/druntime/blob/master/src/gc/gc.d#L419

Actually, I just realized that I was wrong in saying "the memory
will likely be cleared by malloc"; it's only the overallocation that
gets cleared.

Jun 27 2014

"eles" <eles eles.com> writes:

On Friday, 27 June 2014 at 08:17:07 UTC, Ali Çehreli wrote:
 Thank you for your responses. I am partly enlightened. :p

 On 06/27/2014 12:34 AM, safety0ff wrote:

 On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:


 But addRange doesn't seem to make sense for stdlib.malloc'ed 
 memory, right? The reason is, that memory is not managed by the 
 GC so there is no danger of losing that memory due to a 
 collection anyway. It will go away only when I call stdlib.free.

It is not about that, but about the fact that this unmanaged 
memory *might contain* references towards managed memory.

If you intend to place such references into this particular chunk 
of memory, then you need to tell GC to scan the memory chunk for 
references towards managed memory.

Otherwise, the GC might ignore this chunk of memory, find no
references to a managed object elsewhere, and delete the
managed object; your pointer placed in the unmanaged memory
then becomes dangling.

Jun 27 2014

"Sean Kelly" <sean invisibleduck.org> writes:

On Friday, 27 June 2014 at 07:34:55 UTC, safety0ff wrote:
 On Friday, 27 June 2014 at 07:03:28 UTC, Ali Çehreli wrote:
 1) After allocating memory by GC.calloc() to place objects on 
 it, what else should one do?

 Use std.conv.emplace.

And possibly set BlkInfo flags to indicate whether the block has
pointers, and the finalize flag to indicate that it's an object.
I'd look at _d_newclass in Druntime/src/rt/lifetime.d for the
specifics.
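
As a limited sketch of the attribute side of this, the snippet below only toggles NO_SCAN and inspects the block through GC.query. It deliberately does not set FINALIZE, since that is only safe on a block that really contains a properly constructed object. isScanned is a hypothetical helper:

```d
import core.memory : GC;

// True if the GC will scan the block that p points to.
bool isScanned(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) == 0;
}

void main()
{
    // Allocated without attributes, so the block is scanned.
    void* p = GC.calloc(64);
    assert(isScanned(p));

    // Decide afterwards that the block holds no pointers.
    GC.setAttr(p, GC.BlkAttr.NO_SCAN);
    assert(!isScanned(p));

    // ...and change our mind again.
    GC.clrAttr(p, GC.BlkAttr.NO_SCAN);
    assert(isScanned(p));

    // GC.query reports the base address, usable size and attributes.
    auto info = GC.query(p);
    assert(info.base is p);
    assert(info.size >= 64);
    assert((info.attr & GC.BlkAttr.NO_SCAN) == 0);
}
```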

To be honest, I think the GC interface is horribly outdated, but
my proposal for a redesign (first in 2010, then again in 2012 and
once again in 2013) never gained traction.  In short, what I'd
really like to have is a way to tell the GC to allocate an object
of type T.  Perhaps Andrei's allocators will sort this out and
the issue will be moot.  For reference:

http://lists.puremagic.com/pipermail/d-runtime/2010-August/000075.html
http://lists.puremagic.com/pipermail/d-runtime/2012-April/001095.html
http://lists.puremagic.com/pipermail/d-runtime/2013-July/001840.html

Jun 27 2014
