www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Safely extend the size of a malloced memory block after realloc

reply "Benjamin Thaut" <code benjamin-thaut.de> writes:
Consider the following code

void* mem = malloc(500);
GC.addRange(mem, 500);
mem = realloc(mem, 512); // assume the pointer didn't change
GC.removeRange(mem);
// if the GC kicks in here we're f*****
GC.addRange(mem, 512);



I digged into GC.addRange to find out if I simply can skip the 
call to GC.removeRange when the pointer doesn't change and ended 
up with

void addRange(void *pbot, void *ptop, const TypeInfo ti) nothrow 
 nogc
{
     ranges.insert(Range(pbot, ptop));
}

where ranges is defined as: Treap!Range ranges

"insert" of Treap is implemented as:

Node* insert(Node* node, E element)  nogc
{
     if (!node)
         return allocNode(element);
     else if (element < node.element)
     {
         node.left = insert(node.left, element);
         if (node.left.priority < node.priority)
             node = rotateR(node);
     }
     else if (element > node.element)
     {
         node.right = insert(node.right, element);
         if (node.right.priority < node.priority)
             node = rotateL(node);
     }
     else
     {} // ignore duplicate

     return node;
}

The problem is the line that says "ignore duplicate". Because 
that way I can not safly update a GC range.

How are you supposed to safely update a GC range that changed its 
size but not its address? The documentation doesn't say anything 
about it.

Either I'm missing something or this is a bug.

Kind Regards
Benjamin Thaut
Aug 17 2015
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/17/15 3:27 PM, Benjamin Thaut wrote:
 Consider the following code

 void* mem = malloc(500);
 GC.addRange(mem, 500);
 mem = realloc(mem, 512); // assume the pointer didn't change
 GC.removeRange(mem);
This is actually unsafe, you have to remove the range first, or else if it *does* change the pointer, your GC is using free'd memory. Plus, if it does change the pointer, how do you remove the original range?
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing? -Steve
Aug 17 2015
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 8/17/2015 12:38 PM, Steven Schveighoffer wrote:
 On 8/17/15 3:27 PM, Benjamin Thaut wrote:
 void* mem = malloc(500);
 GC.addRange(mem, 500);
 mem = realloc(mem, 512); // assume the pointer didn't change
 GC.removeRange(mem);
This is actually unsafe, you have to remove the range first, or else if it *does* change the pointer, your GC is using free'd memory. Plus, if it does change the pointer, how do you remove the original range?
Good catch, quite right.
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing?
I agree that should work.
Aug 17 2015
prev sibling next sibling parent reply "Benjamin Thaut" <code benjamin-thaut.de> writes:
On Monday, 17 August 2015 at 19:38:21 UTC, Steven Schveighoffer 
wrote:
 On 8/17/15 3:27 PM, Benjamin Thaut wrote:
 Consider the following code

 void* mem = malloc(500);
 GC.addRange(mem, 500);
 mem = realloc(mem, 512); // assume the pointer didn't change
 GC.removeRange(mem);
This is actually unsafe, you have to remove the range first, or else if it *does* change the pointer, your GC is using free'd memory. Plus, if it does change the pointer, how do you remove the original range?
I specifically asked for the case where the pointer doesn't change. Obvisouly the case where it does change is easy, you first add the new range and then remove the old one. But if you do this and the pointer didn't change, the addRange doesn't do anything because its a duplicate and the removeRange then removes the range, because the pointer is still the same. You then end up with the GC not knowing anything about the range anymore.
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing? -Steve
Yes, this would work, but It seems kind of broken to me, that you have to make 4 API Calls to the gc to handle something as simple as a realloc. Kind Regards Benjamin Thaut
Aug 17 2015
next sibling parent "crimaniak" <crimaniak gmail.com> writes:
On Tuesday, 18 August 2015 at 05:51:36 UTC, Benjamin Thaut wrote:

 I specifically asked for the case where the pointer doesn't 
 change. Obvisouly the case where it does change is easy, you 
 first add the new range and then remove the old one. But if you 
 do this and the pointer didn't change, the addRange doesn't do 
 anything because its a duplicate and the removeRange then 
 removes the range, because the pointer is still the same. You 
 then end up with the GC not knowing anything about the range 
 anymore.
May be implementing something like GC.resizeRange() will resolve such type of problem. Is it good idea or no?
Aug 18 2015
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/18/15 1:51 AM, Benjamin Thaut wrote:
 On Monday, 17 August 2015 at 19:38:21 UTC, Steven Schveighoffer wrote:
 On 8/17/15 3:27 PM, Benjamin Thaut wrote:
 Consider the following code

 void* mem = malloc(500);
 GC.addRange(mem, 500);
 mem = realloc(mem, 512); // assume the pointer didn't change
 GC.removeRange(mem);
This is actually unsafe, you have to remove the range first, or else if it *does* change the pointer, your GC is using free'd memory. Plus, if it does change the pointer, how do you remove the original range?
I specifically asked for the case where the pointer doesn't change. Obvisouly the case where it does change is easy, you first add the new range and then remove the old one. But if you do this and the pointer didn't change, the addRange doesn't do anything because its a duplicate and the removeRange then removes the range, because the pointer is still the same. You then end up with the GC not knowing anything about the range anymore.
In the case where the pointer changes, you are in trouble. The original memory is now free, which means it can be overwritten by something else (either the C heap or some other thread that reallocates it). Then if your GC runs *before* you have added the new memory, it may collect the now-no-longer-referred-to data. It's no different than your original situation. I actually think the case where the pointer changes is worse.
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing?
Yes, this would work, but It seems kind of broken to me, that you have to make 4 API Calls to the gc to handle something as simple as a realloc.
First measure code in terms of correctness, before anything else. This is neither a "simple" situation, nor a common one -- the more obscure you get, the more low level you need to write your code. It may come down to the conclusion that using realloc for this just isn't a good idea, use something else. Also, I note that others have said one can call GC.collect from another thread, which is true. One could call GC.enable as well. If you have concerns of this happening (i.e. you don't control all the code, and think your code may coexist with something that calls GC.collect), the likely correct mechanism is to take the GC global lock while doing your operation. I'm not sure if you can do that via the current API, you may have to add such a feature. -Steve
Aug 19 2015
parent "Benjamin Thaut" <code benjamin-thaut.de> writes:
On Wednesday, 19 August 2015 at 14:45:31 UTC, Steven 
Schveighoffer wrote:
 In the case where the pointer changes, you are in trouble. The 
 original memory is now free, which means it can be overwritten 
 by something else (either the C heap or some other thread that 
 reallocates it). Then if your GC runs *before* you have added 
 the new memory, it may collect the now-no-longer-referred-to 
 data. It's no different than your original situation.

 I actually think the case where the pointer changes is worse.
Yes I made the same observation in the meantime.
 Also, I note that others have said one can call GC.collect from 
 another thread, which is true. One could call GC.enable as 
 well. If you have concerns of this happening (i.e. you don't 
 control all the code, and think your code may coexist with 
 something that calls GC.collect), the likely correct mechanism 
 is to take the GC global lock while doing your operation. I'm 
 not sure if you can do that via the current API, you may have 
 to add such a feature.
Yes I figured as much. The entire purpose of this thraed was to point out that you can not safely forward a realloc to the GC. Unfortunately its not a option not to use realloc as I'm binding some code I don't have control over. To summarize the entire issue and a possible solutions I created the following ticket: https://issues.dlang.org/show_bug.cgi?id=14934 Kind Regards Benjamin Thaut
Aug 19 2015
prev sibling parent reply "Casper =?UTF-8?B?RsOmcmdlbWFuZCI=?= <shorttail hotmail.com> writes:
On Monday, 17 August 2015 at 19:38:21 UTC, Steven Schveighoffer 
wrote:
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing?
GC.collect can still be called from another thread.
Aug 18 2015
parent "Benjamin Thaut" <code benjamin-thaut.de> writes:
On Tuesday, 18 August 2015 at 10:27:14 UTC, Casper Færgemand 
wrote:
 On Monday, 17 August 2015 at 19:38:21 UTC, Steven Schveighoffer 
 wrote:
 // if the GC kicks in here we're f*****
 GC.addRange(mem, 512);
Can't you GC.disable around this whole thing?
GC.collect can still be called from another thread.
Good point, also GC.disable doesn't guarantee that the GC will never run. The documentation says that the GC may still run in out of memory or similar situations where it absolutely needs to run. So there isn't a safe way to do this after all.
Aug 18 2015
prev sibling parent reply "welkam" <wwwelkam gmail.com> writes:
// if the GC kicks in here we're f*****

Why?

static nothrow  nogc void removeRange(in void* p);
Removes the memory range starting at p from an internal list of 
ranges to be scanned during a collection. <...>
Aug 17 2015
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/17/15 3:57 PM, welkam wrote:
 // if the GC kicks in here we're f*****

 Why?

 static nothrow  nogc void removeRange(in void* p);
 Removes the memory range starting at p from an internal list of ranges
 to be scanned during a collection. <...>
Because presumably the reason why you have added the range is because it's not GC memory (as in this case). This means that if a GC run kicks in right then, that range of data will not be scanned, and the GC memory it may have been pointing at could potentially be freed. -Steve
Aug 17 2015
parent reply "welkam" <wwwelkam gmail.com> writes:
On Monday, 17 August 2015 at 20:07:08 UTC, Steven Schveighoffer 
wrote:
 On 8/17/15 3:57 PM, welkam wrote:
 // if the GC kicks in here we're f*****

 Why?

 static nothrow  nogc void removeRange(in void* p);
 Removes the memory range starting at p from an internal list 
 of ranges
 to be scanned during a collection. <...>
Because presumably the reason why you have added the range is because it's not GC memory (as in this case). This means that if a GC run kicks in right then, that range of data will not be scanned, and the GC memory it may have been pointing at could potentially be freed. -Steve
I might be wrong, but he should worry about GC before he removes that memory range from GC managed list not after. And his code smells to me. He gives full memory control to GC, but then wants to take it away, fiddle and give it back. I would allocate more than I need to so avoiding a need to extend the memory. If not then either allocate new chunk of memory, copy data and forget about old one or have a data structure where I could add new chunks of memory. Its best to minimise system calls such as malloc and similar because they hit OS and slow down execution.
Aug 17 2015
parent "Benjamin Thaut" <code benjamin-thaut.de> writes:
On Monday, 17 August 2015 at 20:33:45 UTC, welkam wrote:
 I might be wrong, but he should worry about GC before he 
 removes that memory range from GC managed list not after. And 
 his code smells to me. He gives full memory control to GC, but 
 then wants to take it away, fiddle and give it back. I would 
 allocate more than I need to so avoiding a need to extend the 
 memory. If not then either allocate new chunk of memory, copy 
 data and forget about old one or have a data structure where I 
 could add new chunks of memory.
 Its best to minimise system calls such as malloc and similar 
 because they hit OS and slow down execution.
The memory in question is never controlled by the GC. But it may contain pointers into GC memory. You need to update your view on what addRange and removeRange does. Also when you bind a different language to D you don't have the option to change the code, you have to work with whats already there. Kind Regards Benjamin Thaut
Aug 17 2015