
digitalmars.D.learn - Greedy memory handling

reply "monarch_dodra" <monarchdodra gmail.com> writes:
I have a function that will *massively* benefit from having a 
persistent internal buffer it can re-use (and grow) from call to 
call, instead of re-allocating on every call.

What I don't want is either of:
1. To set a fixed limitation of size, if the user ends up making 
repeated calls to something larger than my fixed size.
2. For a single big call which will allocate a HUGE internal 
buffer that will consume all my memory.

What I need is some sort of lazy buffer. Basically, the 
allocation holds, but I don't want it to prevent the GC from 
collecting it if it deems it has gotten too big, or needs more 
memory.

Any idea on how to do something like that? Or literature?
Sep 11 2013
next sibling parent reply "Gary Willoughby" <dev nomad.so> writes:
On Wednesday, 11 September 2013 at 08:06:37 UTC, monarch_dodra 
wrote:
 I have a function that will *massively* benefit from having a 
 persistent internal buffer it can re-use (and grow) from call 
 to call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up 
 making repeated calls to something larger than my fixed size.
 2. For a single big call which will allocate a HUGE internal 
 buffer that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the 
 allocation holds, but I don't want it to prevent the GC from 
 collecting it if it deems it has gotten too big, or needs more 
 memory.

 Any idea on how to do something like that? Or literature?
I've done something similar before, and the general rule then was to start with a small buffer and, if you need more, just double it. So start with something like 4k(?) (depending on what you need) and before each call make sure you have enough; if not, double the buffer by reallocating. This way you grow the buffer, but only when needed. Doubling also makes sure you are not reallocating on every call.

Take a look at the core.memory runtime module for the GC methods. The ones of interest for you are GC.malloc(size), GC.realloc(*buffer, newSize) and GC.extend(*buffer, minSize, desiredSize). You can then let the GC handle it or free it yourself with GC.free(*buffer).
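As a rough illustration of the doubling rule described above, here is a minimal sketch. The `GrowBuffer` struct and its `ensure` method are hypothetical names, and the 4096-byte starting size is just the "4k(?)" guess from the post; only `GC.realloc` and `GC.BlkAttr.NO_SCAN` come from core.memory.

```d
import core.memory : GC;

// Hypothetical helper: a buffer that grows by doubling, starting at 4 KiB.
struct GrowBuffer
{
    ubyte* ptr;
    size_t capacity;

    // Make sure at least `needed` bytes are available.
    void ensure(size_t needed)
    {
        if (needed <= capacity)
            return; // enough room already, no reallocation

        size_t newCap = capacity ? capacity : 4096;
        while (newCap < needed)
            newCap *= 2;

        // With a null pointer GC.realloc behaves like GC.malloc.
        // NO_SCAN: the block holds raw bytes, not pointers.
        ptr = cast(ubyte*) GC.realloc(ptr, newCap, GC.BlkAttr.NO_SCAN);
        capacity = newCap;
    }
}

void main()
{
    GrowBuffer buf;
    buf.ensure(100);  // first call allocates 4096 bytes
    buf.ensure(3000); // still fits, nothing happens
    buf.ensure(5000); // doubles to 8192
    assert(buf.capacity == 8192);
}
```

Because the capacity only ever doubles, a long run of slowly growing requests triggers a logarithmic number of reallocations rather than one per call.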
Sep 11 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 11 September 2013 at 10:28:37 UTC, Gary Willoughby 
wrote:
 You can then let the GC handle it or free it yourself with 
 GC.free(*buffer).
But if the buffer is stored in a static variable, the GC will never collect it. I *could* also free it myself, but why/when would I do that?

Did you just let your buffer grow, and never let it get collected?

Is there a way to do something like "I'm using this buffer, but if you want to collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"?
Sep 11 2013
next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get collected?

 Is there a way to do something like "I'm using this buffer, but if you want to
 collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"
How about GC.addRoot and GC.removeRoot ... ?
Sep 11 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/09/13 13:14, Joseph Rushton Wakeling wrote:
 On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get collected?

 Is there a way to do something like "I'm using this buffer, but if you want to
 collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"
How about GC.addRoot and GC.removeRoot ... ?
I should clarify that a bit more. I mean, from what I understand, you want to be able to do something like this:

	void foo(/* vars */)
	{
		// 1. if buffer not allocated, allocate as necessary
		// 2. send GC a message: "Hey, I'm using this buffer! Don't free!"
		// 3. carry out your calculations
		// 4. send GC a message: "Hey, this buffer can be freed if you need to."
	}

If I understand right, GC.addRoot should take care of (2) and GC.removeRoot can take care of (4). Then, if there's a collection cycle in between calls to foo, fine; if not, next time you enter foo(), the new call to GC.addRoot will protect the memory for the lifetime of the calculation.

But this is conjecture, not speaking from experience :-)
Sep 11 2013
parent reply "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 11 September 2013 at 11:19:27 UTC, Joseph Rushton
Wakeling wrote:
 On 11/09/13 13:14, Joseph Rushton Wakeling wrote:
 On 11/09/13 12:34, monarch_dodra wrote:
 But if the buffer is stored in a static variable, the GC will 
 never collect it.
 I *could* also free it myself, but why/when would I do that?

 Did you just let your buffer grow, and never let it get 
 collected?

 Is there a way to do something like "I'm using this buffer, 
 but if you want to
 collect it, then go ahead. I'll reallocate a new one 
 *if/when* I need it again"
How about GC.addRoot and GC.removeRoot ... ?
I should clarify that a bit more. I mean, from what I understand, you want to be able to do something like this: void foo(/* vars */) { // 1. if buffer not allocated, allocate as necessary // 2. send GC a message: "Hey, I'm using this buffer! Don't free! // 3. carry out your calculations // 4. send GC a message: "Hey, this buffer can be freed if you need to." } If I understand right, GC.addRoot should take care of (2) and GC.removeRoot can take care of (3). Then, if there's a collection cycle in-between calls to foo, fine; if not, next time you enter foo(), the new call to GC.addRoot will protect the memory for the lifetime of the calculation. But this is conjecture, not speaking from experience :-)
That's somewhat better, as it would allow the GC to collect my buffer if it wants to, but I wouldn't actually know about it afterwards, which leaves me screwed.

I *think* addRoot and removeRoot are really designed for passing GC memory to functions that aren't GC-scanned...
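To make that last point concrete, here is a small sketch of the situation addRoot/removeRoot are normally used for: the only reference to a GC block lives in memory the collector does not scan (here, a slot allocated with C's malloc). The `slot` variable and the 64-byte size are made up for illustration.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

void main()
{
    // The only long-lived reference to the GC block will sit in C heap
    // memory, which the collector does not scan.
    void** slot = cast(void**) malloc((void*).sizeof);
    *slot = GC.malloc(64);

    GC.addRoot(*slot);    // keep the block alive despite the hidden reference
    GC.collect();         // the block survives this collection
    (cast(ubyte*) *slot)[0] = 42;
    assert((cast(ubyte*) *slot)[0] == 42);

    GC.removeRoot(*slot); // from here on the block may be collected
    *slot = null;
    free(slot);
}
```

Note that this does not give the "collect it if you like, and tell me" behaviour asked for in the thread: a buffer held in a static variable is reachable regardless of whether it is also registered as a root.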
Sep 11 2013
parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my buffer, if it
 wants to, but I wouldn't actually know about it afterwards which leaves me
screwed.
Just to clarify, is this buffer meant only for internal use in your function, or is it meant to be externally accessed as well? I'd kind of assumed the former.

Either way, isn't it sufficient to have some kind of

	if (buf is null) {
		// allocate the buffer
	}

check in place?

The basic model seems right -- at the moment when you need the buffer, you check if it's allocated (and if not, allocate it as needed); you indicate to the GC that it shouldn't collect the memory; you use the buffer; and the moment it's no longer needed, you indicate to the GC that it's collectable again.

It means having to be very careful to check the buffer's allocation status whenever you want to use it, but I think that's an unavoidable consequence of wanting a static variable that can be freed if needed.

The alternative I thought of was something like comparing the size difference between the currently-needed buffer and the last-needed buffer (... or if you want to be over-the-top, compare to a running average :-), and if the current one is sufficiently smaller, free the old one and re-alloc a new one; but that's a bit _too_ greedy in the free-up-memory stakes, I think.
Sep 11 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Sep-2013 17:33, Joseph Rushton Wakeling пишет:
 On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my buffer,
 if it
 wants to, but I wouldn't actually know about it afterwards which
 leaves me screwed.
Just to clarify, is this buffer meant only for internal use in your function or is it meant to be externally accessed as well? I'd kind of assumed the former. Either way, isn't it sufficient to have some kind of if (buf is null) { // allocate the buffer } check in place? The basic model seems right -- at the moment when you need the buffer, you check if it's allocated (and if not, allocate it as needed); you indicate to the GC that it shouldn't collect the memory; you use the buffer; and the moment it's no longer needed, you indicate to the GC that it's collectable again. It means having to be very careful to check the buffer's allocation status whenever you want to use it, but I think that's an unavoidable consequence of wanting a static variable that can be freed if needed.
Problem is - said GC-freed memory could then be reused in some way. I can't imagine how you'd test that the block that is allocated is *still your old* block.

-- Dmitry Olshansky
Sep 11 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 11/09/13 15:45, Dmitry Olshansky wrote:
 Problem is - said GC-freed memory could be then reused in some way. I can't
 imagine how you'd test that the block that is allocated is *still your old*
block.
Ahh, nasty. I'd assumed that the buffer would have been reset to null in the event that the GC freed its memory.
Sep 11 2013
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Wednesday, 11 September 2013 at 13:33:23 UTC, Joseph Rushton 
Wakeling wrote:
 On 11/09/13 15:13, monarch_dodra wrote:
 That's somewhat better, as it would allow the GC to collect my 
 buffer, if it
 wants to, but I wouldn't actually know about it afterwards 
 which leaves me screwed.
Just to clarify, is this buffer meant only for internal use in your function or is it meant to be externally accessed as well? I'd kind of assumed the former. Either way, isn't it sufficient to have some kind of if (buf is null) { // allocate the buffer } check in place? The basic model seems right -- at the moment when you need the buffer, you check if it's allocated (and if not, allocate it as needed); you indicate to the GC that it shouldn't collect the memory; you use the buffer; and the moment it's no longer needed, you indicate to the GC that it's collectable again. It means having to be very careful to check the buffer's allocation status whenever you want to use it, but I think that's an unavoidable consequence of wanting a static variable that can be freed if needed. The alternative I thought of was something like comparing the size difference between the currently-needed buffer and the last-needed buffer (... or if you want to be over-the-top, compare to a running average:-), and if the current one is sufficiently smaller, free the old one and re-alloc a new one; but that's a bit _too_ greedy in the free-up-memory stakes, I think.
The buffer is meant strictly for internal use. It never escapes the function it is used in, which is not re-entrant either.

Basically, I'm storing the buffer in a "static ubyte[]", and if there isn't enough room for what I'm doing, I simply make it grow. No problems there.

The issue I'm trying to solve is the "and the moment it's no longer needed" part. The function is really just a free function in a library. The user could use it only once, or use it very repeatedly, I don't know. In particular, the amount of buffer needed has a 1:1 correlation with the user's input size. The user could repeatedly call me with input in the size of a couple of bytes, or just once or twice with input in the megabytes.

I *could* just allocate and forget about it, but I was curious about having a mechanism where the buffer would just be "potentially collected" between two calls, as a form of "failsafe" if it got too greedy, or if the user just hasn't used the function in a while.
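For reference, the "static ubyte[]" arrangement described here might look roughly like the following. The function name `scratch` is hypothetical; the point is only that the static slice keeps the block reachable, which is exactly why the GC never gets a chance to reclaim it between calls.

```d
// Hypothetical helper returning a scratch area of at least `needed` bytes.
ubyte[] scratch(size_t needed)
{
    static ubyte[] buf;      // thread-local, survives between calls
    if (buf.length < needed)
        buf.length = needed; // may reallocate; old contents are copied
    return buf[0 .. needed];
}

void main()
{
    auto a = scratch(16);
    auto b = scratch(8);     // no allocation: 8 <= 16
    assert(a.ptr is b.ptr);
    assert(b.length == 8);
}
```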
Sep 11 2013
prev sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
11-Sep-2013 14:34, monarch_dodra пишет:
 On Wednesday, 11 September 2013 at 10:28:37 UTC, Gary Willoughby wrote:
 You can then let the GC handle it or free it yourself with
 GC.free(*buffer).
But if the buffer is stored in a static variable, the GC will never collect it. I *could* also free it myself, but why/when would I do that?

Did you just let your buffer grow, and never let it get collected?

Is there a way to do something like "I'm using this buffer, but if you want to collect it, then go ahead. I'll reallocate a new one *if/when* I need it again"?
You need weak references. With a manually registered finalizer for your buffer + a flag you might pull it off (but be extremely careful). There is something like this in an upcoming std.signals2, IIRC.

Basically the sequence should be: pin the pointer with a strong ref if it's valid, use it, unpin. If it wasn't valid, it got collected; allocate a new buffer and repeat. The "was valid" check is the ugly part, and prone to race conditions (GC works in its own thread, got to disable/enable etc.).

All in all this is the kind of stuff that:
a) Druntime/Phobos should provide
b) Is actually needed for other things as well

I'd file an enhancement if there isn't one already.

-- Dmitry Olshansky
Sep 11 2013
prev sibling next sibling parent "Namespace" <rswhite4 googlemail.com> writes:
On Wednesday, 11 September 2013 at 08:06:37 UTC, monarch_dodra 
wrote:
 I have a function that will *massively* benefit from having a 
 persistent internal buffer it can re-use (and grow) from call 
 to call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up 
 making repeated calls to something larger than my fixed size.
 2. For a single big call which will allocate a HUGE internal 
 buffer that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the 
 allocation holds, but I don't want it to prevent the GC from 
 collecting it if it deems it has gotten too big, or needs more 
 memory.

 Any idea on how to do something like that? Or literature?
I do not know if it fits, but I had a similar problem some time ago: http://forum.dlang.org/thread/wsxajhlsupnraevowcgd@forum.dlang.org
Sep 11 2013
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-09-11 10:06, monarch_dodra wrote:
 I have a function that will *massively* benefit from having a persistent
 internal buffer it can re-use (and grow) from call to call, instead of
 re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up making
 repeated calls to something larger than my fixed size.
 2. For a single big call which will allocate a HUGE internal buffer that
 will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the allocation
 holds, but I don't want it to prevent the GC from collecting it if it
 deems it has gotten too big, or needs more memory.

 Any idea on how to do something like that? Or literature?
How about keeping a stack or static buffer? If that gets too small, use a new buffer. When you're done with the new buffer, set it to null to allow the GC to collect it. Then repeat.

-- /Jacob Carlborg
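A minimal sketch of this two-buffer idea follows. The names `fixedBuf` and `acquire` and the 4096-byte size are assumptions for illustration: small requests reuse a fixed static array, and larger ones get a fresh GC allocation that becomes collectable as soon as the caller drops its slice.

```d
// Fixed-size storage reused for small requests (thread-local by default).
private ubyte[4096] fixedBuf;

// Hypothetical helper: returns a slice of the static array when it is
// large enough, otherwise a temporary GC-allocated array.
ubyte[] acquire(size_t needed)
{
    if (needed <= fixedBuf.length)
        return fixedBuf[0 .. needed];
    return new ubyte[needed]; // collectable once no slice refers to it
}

void main()
{
    auto small = acquire(100);
    assert(small.ptr is fixedBuf.ptr);   // served from the static array

    auto big = acquire(10_000);
    assert(big.ptr !is fixedBuf.ptr);    // served from the GC heap
    big = null;                          // lets the GC reclaim it later
}
```

The trade-off is that large requests pay for an allocation every time, which is the cost the original question was trying to avoid.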
Sep 11 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
 On 2013-09-11 10:06, monarch_dodra wrote:
I have a function that will *massively* benefit from having a
persistent internal buffer it can re-use (and grow) from call to
call, instead of re-allocating on every call.

What I don't want is either of:
1. To set a fixed limitation of size, if the user ends up making
repeated calls to something larger than my fixed size.
2. For a single big call which will allocate a HUGE internal buffer
that will consume all my memory.

What I need is some sort of lazy buffer. Basically, the allocation
holds, but I don't want it to prevent the GC from collecting it if
it deems it has gotten too big, or needs more memory.

Any idea on how to do something like that? Or literature?
How about keeping a stack or static buffer. If that gets too small use a new buffer. When you're done with the new buffer set it to null to allow the GC to collect it. Then repeat.
[...]

The problem is, he wants to reuse the buffer next time if the GC hasn't collected it yet.

Here's an idea, though. It doesn't completely solve the problem, but it just occurred to me that "weak pointers" (i.e., ignored by the GC for the purposes of marking) can be simulated by XOR'ing the pointer value with some mask so that it's not recognized as a pointer by the GC. This can be encapsulated by a weak pointer struct that automatically does the translation:

	struct WeakPointer(T) {
		enum size_t mask = 0xdeadbeef;
		union Impl {
			T* ptr;
			size_t uintVal;
		}
		Impl impl;
		void set(T* ptr) @system {
			impl.ptr = ptr;
			impl.uintVal ^= mask;
		}
		T* get() @system {
			Impl i = impl;
			i.uintVal ^= mask;
			return i.ptr;
		}
	}

	WeakPointer!Buffer bufferRef;

	void doWork(Args...) {
		T* buffer;
		if (bufferRef.get() is null) {
			// Buffer hasn't been allocated yet
			buffer = allocateNewBuffer();
			bufferRef.set(buffer);
		} else {
			void *p;
			core.memory.GC.getAttr(p);
			if (p is null || p != bufferRef.get()) {
				// GC has collected previous buffer
				buffer = allocateNewBuffer();
				bufferRef.set(buffer);
			}
		}
		useBuffer(buffer);
		...
	}

Note that the inner if block is not 100% safe, because there's no guarantee that even if the base pointer of the block hasn't changed, the GC hasn't reallocated the block to somebody else. So this part is still yet to be solved.

T

-- It is widely believed that reinventing the wheel is a waste of time; but I disagree: without wheel reinventers, we would still be stuck with wooden horse-cart wheels.
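The masking trick on its own can be tried out in isolation. The sketch below is a simplified variant, not the code from the post: it stores the masked value in a plain `size_t`, flips every bit instead of XOR'ing with 0xdeadbeef, and initialises the field so that a fresh WeakPointer reads back as null. Whether a flipped address is guaranteed to fall outside the GC heap depends on the platform's address-space layout, so treat that as an assumption.

```d
// Simplified weak pointer: the address is stored bit-flipped so that a
// conservative scan is unlikely to mistake it for a live pointer.
struct WeakPointer(T)
{
    enum size_t mask = size_t.max;  // assumption: flip all bits
    private size_t masked = mask;   // null ^ mask, so get() starts as null

    void set(T* ptr) @system
    {
        masked = cast(size_t) ptr ^ mask;
    }

    T* get() @system
    {
        return cast(T*)(masked ^ mask);
    }
}

void main()
{
    WeakPointer!int w;
    assert(w.get() is null);

    int* p = new int;  // `p` on the stack is a strong reference
    *p = 7;
    w.set(p);
    assert(w.get() is p);
    assert(*w.get() == 7);
}
```

This only hides the pointer; it does nothing to tell the holder whether the target has been collected, which is the open problem noted at the end of the post.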
Sep 12 2013
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12-Sep-2013 17:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 08:27:59AM +0200, Jacob Carlborg wrote:
 On 2013-09-11 10:06, monarch_dodra wrote:
 I have a function that will *massively* benefit from having a
 persistent internal buffer it can re-use (and grow) from call to
 call, instead of re-allocating on every call.

 What I don't want is either of:
 1. To set a fixed limitation of size, if the user ends up making
 repeated calls to something larger than my fixed size.
 2. For a single big call which will allocate a HUGE internal buffer
 that will consume all my memory.

 What I need is some sort of lazy buffer. Basically, the allocation
 holds, but I don't want it to prevent the GC from collecting it if
 it deems it has gotten too big, or needs more memory.

 Any idea on how to do something like that? Or literature?
How about keeping a stack or static buffer. If that gets too small use a new buffer. When you're done with the new buffer set it to null to allow the GC to collect it. Then repeat.
[...]

The problem is, he wants to reuse the buffer next time if the GC hasn't collected it yet.

Here's an idea, though. It doesn't completely solve the problem, but it just occurred to me that "weak pointers" (i.e., ignored by the GC for the purposes of marking) can be simulated by XOR'ing the pointer value with some mask so that it's not recognized as a pointer by the GC. This can be encapsulated by a weak pointer struct that automatically does the translation:

 	struct WeakPointer(T) {
 		enum size_t mask = 0xdeadbeef;
 		union Impl {
 			T* ptr;
 			size_t uintVal;
 		}
 		Impl impl;
 		void set(T* ptr) @system {
 			impl.ptr = ptr;
 			impl.uintVal ^= mask;
 		}
 		T* get() @system {
 			Impl i = impl;
 			i.uintVal ^= mask;
 			return i.ptr;
 		}
 	}

 	WeakPointer!Buffer bufferRef;

 	void doWork(Args...) {
 		T* buffer;
 		if (bufferRef.get() is null) {
 			// Buffer hasn't been allocated yet
 			buffer = allocateNewBuffer();
 			bufferRef.set(buffer);
 		} else {
 			void *p;
 			core.memory.GC.getAttr(p);
This line above is not a 100% good idea... at least with deadbeef as the mask. If we know what OS you compile for, we may just flip, say, the upper bit and get a pointer into kernel space (which surely isn't in the GC pool). Even then, your last paragraph pretty much destroys it.

A better option is to have a finalizer hooked up to set some flag. Then, _after_ restoring the pointer, we consult that flag variable.
 			if (p is null || p != bufferRef.get()) {
 				// GC has collected previous buffer
 				buffer = allocateNewBuffer();
 				bufferRef.set(buffer);
 			}
 		}
 		useBuffer(buffer);
 		...
 	}

 Note that the inner if block is not 100% safe, because there's no
 guarantee that even if the base pointer of the block hasn't changed, the
 GC hasn't reallocated the block to somebody else. So this part is still
 yet to be solved.


 T
-- Dmitry Olshansky
Sep 12 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 17:51, H. S. Teoh пишет:
[...]
	struct WeakPointer(T) {
		enum size_t mask = 0xdeadbeef;
		union Impl {
			T* ptr;
			size_t uintVal;
		}
		Impl impl;
		void set(T* ptr) @system {
			impl.ptr = ptr;
			impl.uintVal ^= mask;
		}
		T* get() @system {
			Impl i = impl;
			i.uintVal ^= mask;
			return i.ptr;
		}
	}

	WeakPointer!Buffer bufferRef;

	void doWork(Args...) {
		T* buffer;
		if (bufferRef.get() is null) {
			// Buffer hasn't been allocated yet
			buffer = allocateNewBuffer();
			bufferRef.set(buffer);
		} else {
			void *p;
			core.memory.GC.getAttr(p);
This line above is not a 100% good idea... at least with deadbeef as the mask. If we know what OS you compile for, we may just flip, say, the upper bit and get a pointer into kernel space (which surely isn't in the GC pool). Even then, your last paragraph pretty much destroys it.
Well, that was just an example value. :) If we know which OS it is and how it assigns VM addresses, then we can adjust the mask appropriately. But yeah, calling GC.getAttr is unreliable since you can't tell whether the block is what you had before, or somebody else's new data. [...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.
Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).

T

-- People tell me that I'm skeptical, but I don't believe it.
Sep 12 2013
parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
12-Sep-2013 20:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 17:51, H. S. Teoh пишет:
[...]
 	struct WeakPointer(T) {
 		enum size_t mask = 0xdeadbeef;
 		union Impl {
 			T* ptr;
 			size_t uintVal;
 		}
 		Impl impl;
 		void set(T* ptr) @system {
 			impl.ptr = ptr;
 			impl.uintVal ^= mask;
 		}
 		T* get() @system {
 			Impl i = impl;
 			i.uintVal ^= mask;
 			return i.ptr;
 		}
 	}

 	WeakPointer!Buffer bufferRef;

 	void doWork(Args...) {
 		T* buffer;
 		if (bufferRef.get() is null) {
 			// Buffer hasn't been allocated yet
 			buffer = allocateNewBuffer();
 			bufferRef.set(buffer);
 		} else {
 			void *p;
 			core.memory.GC.getAttr(p);
This line above is not a 100% good idea... at least with deadbeef as the mask. If we know what OS you compile for, we may just flip, say, the upper bit and get a pointer into kernel space (which surely isn't in the GC pool). Even then, your last paragraph pretty much destroys it.
Well, that was just an example value. :) If we know which OS it is and how it assigns VM addresses, then we can adjust the mask appropriately. But yeah, calling GC.getAttr is unreliable since you can't tell whether the block is what you had before, or somebody else's new data.
It occurred to me that there are modes where the full address space is available, typically for an x86 app running on top of an x64 kernel (e.g. Windows Wow64 could do that; Linux also has the so-called x32 ABI).
 [...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.
Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).
Double indirection? Allocate a class that has a finalizer, and hold that via a weak ref. The wrapper in turn contains a pointer to the buffer. The interesting point then is that one may allocate said buffer via C's realloc.

Then, once the helper object is collected, the finalizer is called, and this is where we call free to clean up C's heap.

I'm thinking this is actually going to work.

-- Dmitry Olshansky
Sep 12 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Sep 12, 2013 at 11:13:30PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 20:51, H. S. Teoh пишет:
On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
[...]
Better option is to have finalizer hooked up to set some flag. Then
_after_ restoring the pointer we consult that flag variable.
Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).
Double indirection? Allocate a class that has finalizer, hold that via weak-ref. The wrapper in turn contains a pointer to the buffer. The interesting point then is that one may allocate said buffer via C's realloc. Then once helper struct is collected the finalizer is called and this is where we call free to cleanup C's heap. I'm thinking this actually is going to work.
[...]

Interesting idea, use C's malloc/realloc to hold the actual buffer. The only possible catch is, will that cause the GC to collect when it runs out of memory (which is the whole point of the OP's question)? I.e., does it make a difference in GC behaviour to allocate, say, 10MB from the GC vs. allocating 10MB from malloc/realloc?

Assuming we have that settled, something like this should work:

	bool isValid;
	final class BufWrapper {
		void* ptrToMallocedBuf;
		this(void* ptr) {
			// We need this, 'cos otherwise we don't know if
			// our weak ref to BufWrapper is still valid!
			isValid = true;

			ptrToMallocedBuf = ptr;
		}
		~this() {
			// If we're being collected, free the real
			// buffer too.
			free(ptrToMallocedBuf);
			isValid = false;
		}
	}

	// WeakPointer masks the pointer to BufWrapper in some suitable
	// way so that the GC will collect it when needed.
	WeakPointer!BufWrapper wrappedBufRef;

	void doWork(...) {
		void* buf;
		if (!isValid) {
			buf = realloc(null, bufSize);
			wrappedBufRef.set(buf);
		} else {
			buf = wrappedBufRef.get();
		}

		// use buf here.
	}

T

-- Public parking: euphemism for paid parking. -- Flora
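The finalizer half of this idea can be exercised deterministically by calling `destroy` on the wrapper, which runs the destructor right away instead of waiting for a collection. The sketch below is a hedged variant of the class above, not the posted code: the names `MallocBuf` and `bufferAlive` are hypothetical, and the flag is declared `__gshared` on the assumption that a finalizer may run on a different thread from the one that created the object (a plain module-level `bool` is thread-local in D).

```d
import core.stdc.stdlib : free, realloc;

// Cleared by the finalizer; __gshared so every thread sees the same flag.
__gshared bool bufferAlive;

final class MallocBuf
{
    void* ptr;        // C-heap memory, invisible to the GC
    size_t capacity;

    this(size_t size)
    {
        ptr = realloc(null, size);
        capacity = size;
        bufferAlive = ptr !is null;
    }

    ~this()
    {
        // Runs when the wrapper is collected (or destroyed explicitly):
        // release the C-heap buffer and signal that it is gone.
        free(ptr);
        ptr = null;
        bufferAlive = false;
    }
}

void main()
{
    auto w = new MallocBuf(1024);
    assert(bufferAlive);
    assert(w.capacity == 1024);

    destroy(w); // stands in for "the GC collected the wrapper"
    assert(!bufferAlive);
}
```

In real use the wrapper would be reachable only through a masked (weak) reference, so that the collector is free to finalize it under memory pressure.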
Sep 12 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
13-Sep-2013 00:11, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 11:13:30PM +0400, Dmitry Olshansky wrote:
 12-Sep-2013 20:51, H. S. Teoh пишет:
 On Thu, Sep 12, 2013 at 07:50:25PM +0400, Dmitry Olshansky wrote:
[...]
 Better option is to have finalizer hooked up to set some flag. Then
 _after_ restoring the pointer we consult that flag variable.
Good idea. The problem is, how to set a finalizer on a memory block that can change in size? The OP's original situation was that the buffer can be extended while in use, but I don't know of any D type that can associate a dtor with a ubyte[] array (note that the GC collecting the wrapper struct/class around the ubyte[] is not the same as collecting the actual memory block storing the ubyte[] -- the former can happen without the latter).
Double indirection? Allocate a class that has finalizer, hold that via weak-ref. The wrapper in turn contains a pointer to the buffer. The interesting point then is that one may allocate said buffer via C's realloc. Then once helper struct is collected the finalizer is called and this is where we call free to cleanup C's heap. I'm thinking this actually is going to work.
[...] Interesting idea, use C's malloc/realloc to hold the actual buffer. Only possible catch is, will that cause the GC to collect when it runs out of memory (which is the whole point of the OP's question)? I.e., does it make a difference in GC behaviour to allocate, say, 10MB from the GC vs. allocating 10MB from malloc/realloc?
The only problem I can foresee is that when it runs the collection (*and* being tight on RAM), the C heap will not return said chunk back to the OS. Then the GC won't pick up that memory, and we'd run out of RAM.

I would safely assume, however, that for big buffers mmap/munmap (or its analogue) is called, and hence the memory is returned to the OS. That's what all allocators do for huge chunks anyway.

Otherwise we are still in good shape: the memory will eventually be freed, yet we get to reuse it quite cheaply in a tight loop. I don't expect collections to run in these all that often ;)
 Assuming we have that settled, something like this should work:

 	bool isValid;
 	final class BufWrapper {
 		void* ptrToMallocedBuf;
 		this(void* ptr) {
 			// We need this, 'cos otherwise we don't know if
 			// our weak ref to BufWrapper is still valid!
 			isValid = true;

 			ptrToMallocedBuf = ptr;
 		}
 		~this() {
 			// If we're being collected, free the real
 			// buffer too.
 			free(ptrToMallocedBuf);
 			isValid = false;
 		}
 	}

 	// WeakPointer masks the pointer to BufWrapper in some suitable
 	// way so that the GC will collect it when needed.
 	WeakPointer!BufWrapper wrappedBufRef;

 	void doWork(...) {
 		void* buf;
Careful here - you really have first to get a pointer ... THEN check if it's valid.
 		if (!isValid) {
 			buf = realloc(null, bufSize);
 			wrappedBufRef.set(buf);
 		} else {
// otherwise at this point GC.collect runs and presto, memory is freed
// too bad such a thing will never show up in unittests
 			buf = wrappedBufRef.get();
 		}

 		// use buf here.
 	}
Checking the flag should somehow be part of the weak ref's job. I'd rather make it less error prone:

	void* buf;
	// unmask pointer, do the flag check - false means was freed
	if (!weakRef.readTo(buf)) {
		// create & set new buf
		buf = realloc(...);
	}
	... // use buf
	weakRef.set(buf);

I think I'd code it up if nobody beats me to it, as I need the exact same pattern for std.regex anyway.

-- Dmitry Olshansky
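One way the `readTo` interface could be put together is sketched below. Everything here is an assumption made for illustration rather than the code that was eventually written: the `Wrapper` class, the `WeakRef` struct, the `__gshared` flag and the bit-flipping mask are all hypothetical, and the sketch ignores multi-threaded races. It does follow the ordering insisted on above: the pointer is unmasked into a local (strong) reference first, and only then is the flag consulted.

```d
import core.stdc.stdlib : free, realloc;

__gshared bool wrapperAlive; // cleared by the finalizer

final class Wrapper
{
    void* buf; // C-heap buffer owned by this wrapper
    this(void* b) { buf = b; wrapperAlive = true; }
    ~this() { free(buf); buf = null; wrapperAlive = false; }
}

struct WeakRef
{
    private size_t masked = size_t.max; // null with all bits flipped

    void set(Wrapper w) @system
    {
        masked = ~cast(size_t) cast(void*) w;
    }

    // Unmask first so a strong reference exists, then check the flag.
    // Returns false if there is no wrapper or it has been finalized.
    bool readTo(out Wrapper w) @system
    {
        w = cast(Wrapper) cast(void*) ~masked;
        if (w is null || !wrapperAlive)
        {
            w = null;
            return false;
        }
        return true;
    }
}

void main()
{
    WeakRef weak;
    Wrapper w;
    assert(!weak.readTo(w));  // nothing stored yet

    auto fresh = new Wrapper(realloc(null, 64));
    weak.set(fresh);
    assert(weak.readTo(w) && w is fresh);

    destroy(fresh);           // simulate collection of the wrapper
    assert(!weak.readTo(w));
    assert(w is null);
}
```

A single global flag only works for one buffer at a time; a reusable version would need per-reference state, which is part of why this was suggested as something for Druntime/Phobos to provide.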
Sep 12 2013
prev sibling parent "monarch_dodra" <monarchdodra gmail.com> writes:
On Thursday, 12 September 2013 at 19:13:40 UTC, Dmitry Olshansky
wrote:
 Double indirection? Allocate a class that has finalizer, hold 
 that via weak-ref. The wrapper in turn contains a pointer to 
 the buffer. The interesting point then is that one may allocate 
 said buffer via C's realloc.

 Then once helper struct is collected the finalizer is called 
 and this is where we call free to cleanup C's heap.

 I'm thinking this actually is going to work.
Yum. I like this.

I was going to say: "At the end of the day, if the GC doesn't *tell* us the collection happened, then the problem is not solvable. We'd need a way that would allow the GC to tell us the memory was *finalized*". And then I'd go on to say "since our GC is non-finalizing, there is simply no solution". But then classes. Derp.

I'd be really interested in having a finalized solution. The details of memory addressing are not my strong suit, so I wouldn't trust myself with all those union{ptr/size_t} things.

Thanks, I'll start toying around with this :)
Sep 12 2013
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-09-12 15:51, H. S. Teoh wrote:

 The problem is, he wants to reuse the buffer next time if the GC hasn't
 collected it yet.
I was thinking he could reuse the stack/static buffer. Basically using two buffers, one static and one dynamic. -- /Jacob Carlborg
Sep 12 2013